The analysis of choice behavior has concerned many students of social
science. Choices among political candidates, market products, investment
plans, transportation modes and professional careers have been investigated
by economists, political scientists and psychologists using a variety of
empirical and theoretical methods. An examination of the empirical literature
indicates that choice behavior is often inconsistent, hierarchical,
and context dependent.

Inconsistency  refers  to  the  observation  that  people  sometimes  make  different 
choices  under  seemingly  identical  conditions.  Although  inconsistency  can  be 
explained as the result of learning, satiation, or change in taste, it
tends  to  persist  even  when  the  effects  of  these  factors  are  controlled  or 
minimized.  Furthermore,  even  in  an  essentially  unique  choice  situation,  which 
cannot  be  replicated,  people  often  experience  doubt  regarding  their  decisions, 
and  feel  that  in  a  different  state  of  mind  they  might  have  made  a  different 
choice.  The  observed  inconsistency  and  the  experienced  uncertainty  associated 
with  choice  behavior  have  led  several  investigators  to  conceptualize  choice  as 
a  probabilistic  process,  and  to  use  the  concept  of  choice  probability  as  a 
basis for the measurement of strength of preference (Thurstone, 1927; Luce,
1959; Marschak, 1960).

Choice  among  many  alternatives  appears  to  follow  a  hierarchical  elimination 
process.  When  faced  with  many  alternatives  (e.g.,  job  offers,  houses,  cars) 
people  appear  to  eliminate  various  subsets  of  alternatives  sequentially 
according  to  some  hierarchical  structure,  rather  than  scanning  all  the 
options  in  an  exhaustive  manner.  This  strategy  is  particularly  appealing  when 
the  number  of  alternatives  is  large  and  an  exhaustive  evaluation  is  either  not 
feasible or very costly in time and effort. Indeed, these considerations have
led  several  theorists,  notably  Simon  (1957),  to  modify  the  classical  criterion 
of  maximization,  and  to  view  the  choice  process  as  a  search  for  an  acceptable 
alternative  that  satisfies  certain  criteria.  Such  a  search  is  naturally 
executed  by  a  sequential  elimination  procedure. 

Choice  behavior  appears  to  be  context  dependent.  That  is,  the  strength 
of preference of x over y depends on the context of the other available
alternatives. Furthermore, choice probability depends not only on the values of
the alternatives, but also on their similarity or comparability; see, e.g.,
Tversky (1972a). An analysis of the structural relations among the alternatives,
therefore, is an essential element of any theory which purports to explain the
effects of similarity and context on choice.

The  present  paper  develops  a  probabilistic,  context-dependent  choice  model — 
called  preference  tree — based  on  a  hierarchical  elimination  process.  The  first 
part  of  the  paper  illustrates  the  tree  model  and  investigates  its  formal 
properties  and  their  psychological  significance.  In  the  second  part  of  the  paper 
the  model  is  applied  to  several  sets  of  choice  data  that  are  represented  as 
preference trees. The problem of constrained choice is investigated in the third
section and the implications of the tree model are discussed in the last section.

THEORY 

In  order  to  motivate  and  develop  the  theory  of  preference  trees,  we 
discuss  first  the  more  general  model  of  elimination  by  aspects,  or  EBA. 

According  to  this  model  (Tversky,  1972a,  b)  each  alternative  is  viewed  as  a 
collection  of  measurable  aspects,  and  choice  is  described  as  a  covert  process 
of  eliminations.  At  each  stage  in  the  process  one  selects  an  aspect  (from 
those included in the available alternatives) with probability that is
proportional to its measure. The selection of an aspect eliminates all the
alternatives that do not include this aspect, and the process continues
until only a single alternative remains. Consider, for example, the choice
of a restaurant for dinner. The first aspect selected may be seafood;
this eliminates all restaurants that do not serve acceptable seafood.
Given the remaining alternatives, another aspect, say a price level, is
selected and all alternatives that do not meet this criterion are
eliminated. The process continues until only one restaurant, one that
includes all the selected aspects, remains.

In order to characterize this process in formal terms, some notation
is introduced. Let $T = \{x, y, z, \ldots\}$ be the total finite set of
alternatives under study, and let $A, B, C$ denote nonempty subsets of $T$.
Let $P(x,A)$ be the probability of choosing alternative $x$ from an offered
set $A$. Naturally, $\sum_{x \in A} P(x,A) = 1$ for all $A \subseteq T$, and
$P(x,A) = 0$ for $x \notin A$. For simplicity, we write $P(x,y)$ for
$P(x,\{x,y\})$. Choice probabilities are typically estimated from the
relative frequency of selecting $x$ on repeated choices from $A$.
Next, consider a mapping that associates with each $x$ in $T$ a finite nonempty
set $x' = \{\alpha, \beta, \ldots\}$ of elements which are interpreted as the
aspects of $x$. An alternative $x$ is said to include an aspect $\alpha$
whenever $\alpha$ is an element of $x'$. The present theory represents choice
alternatives as collections of aspects which denote all valued attributes of
the options, including quantitative attributes (e.g., price, quality) and
nominal attributes (e.g., automatic transmission on a car, or fried rice on a
menu). The present analysis, however, does not require prior identification of
the aspects associated with each alternative.

For any subset $A$ of $T$, let $A'$ be the set of aspects that belong to at
least one alternative in $A$, i.e.,
$A' = \{\alpha \mid \alpha \in x' \text{ for some } x \in A\}$. In particular,
$T'$ is the family of all aspects under consideration. For any $\alpha$ in $T'$,
let $A_\alpha = \{x \in A \mid \alpha \in x'\}$ denote the set of all
alternatives of $A$ that include $\alpha$. Note that $A'$ is a set of aspects
while $A_\alpha$ is a set of alternatives.
Using these constructs, the EBA model can now be defined as follows.

A family of choice probabilities $P(x,A)$, $x \in A \subseteq T$, satisfies EBA
if there exists a non-negative scale $u$ defined on $T'$ such that for all
$x \in A \subseteq T$

$$P(x,A) = \frac{\sum_{\alpha \in x'} u(\alpha)\, P(x,A_\alpha)}{\sum_{\beta \in A'} u(\beta)} \tag{1}$$

This recursive formula, which defines the EBA model, expresses the
probability of choosing $x$ from $A$ as a weighted sum of the probabilities
$P(x,A_\alpha)$ of choosing $x$ from proper subsets of $A$. It is easy to show
that aspects which are common to all the alternatives under consideration do
not affect choice probability and can, therefore, be discarded.


Insert  Figure  1  here 


To illustrate the model, consider the case of three alternatives where
$A = \{x,y,z\}$, and let $x' = \{\alpha,\theta,\delta,\lambda\}$,
$y' = \{\beta,\theta,\mu,\lambda\}$, and $z' = \{\gamma,\delta,\mu,\lambda\}$,
see Figure 1. Thus, $A_\alpha = \{x\}$, $A_\theta = \{x,y\}$,
$A_\mu = \{y,z\}$, $A_\lambda = \{x,y,z\}$, etc.
Discarding $\lambda$, which is shared by all alternatives, and normalizing the
scale $u$ such that
$u(\alpha) + u(\beta) + u(\gamma) + u(\delta) + u(\theta) + u(\mu) = 1$ yields

$$P(x,A) = u(\alpha)P(x,A_\alpha) + u(\theta)P(x,A_\theta) + u(\delta)P(x,A_\delta)
         = u(\alpha) + u(\theta)P(x,y) + u(\delta)P(x,z),$$

where

$$P(x,y) = \frac{u(\alpha) + u(\delta)}{u(\alpha) + u(\delta) + u(\beta) + u(\mu)}
         = \frac{u(x'-y')}{u(x'-y') + u(y'-x')}.$$

This equation for binary choice probabilities coincides with Restle's (1961)
model. According to the EBA model, $x$ can be chosen from $A$ (i) if $\alpha$
is selected first, (ii) if $\theta$ is selected first and then either $\alpha$
or $\delta$ is selected later, (iii) if $\delta$ is selected first and then
either $\alpha$ or $\theta$ is selected later. The probability of choosing $x$
from $A$, therefore, is the sum of the probabilities associated with these
outcomes.
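
The recursion in Equation (1) can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original development: the function name `eba`, the spelled-out aspect names, and the numerical measures are assumptions chosen for the three-alternative example, and aspects shared by every offered alternative are dropped before each step, as the text allows.

```python
from fractions import Fraction

# Hypothetical aspect sets for the three-alternative example: "theta" is shared
# by x and y only, "delta" by x and z only, "mu" by y and z only, and "lam" by
# all three. The measures in U are made-up illustrative values.
ASPECTS = {
    "x": {"alpha", "theta", "delta", "lam"},
    "y": {"beta", "theta", "mu", "lam"},
    "z": {"gamma", "delta", "mu", "lam"},
}
U = {"alpha": 1, "beta": 2, "gamma": 3, "theta": 1, "delta": 1, "mu": 2, "lam": 5}


def eba(x, A, aspects, u):
    """Return P(x, A) by the recursion of Equation (1)."""
    A = frozenset(A)
    if x not in A:
        return Fraction(0)
    if len(A) == 1:
        return Fraction(1)
    sets = [set(aspects[y]) for y in A]
    common = set.intersection(*sets)      # aspects shared by all of A: discarded
    pool = set.union(*sets) - common      # aspects that can eliminate something
    total = sum(Fraction(u[a]) for a in pool)
    if total == 0:
        raise ValueError("offered alternatives are indistinguishable")
    weighted = Fraction(0)
    for a in set(aspects[x]) - common:
        A_a = frozenset(y for y in A if a in aspects[y])  # survivors of aspect a
        weighted += Fraction(u[a]) * eba(x, A_a, aspects, u)
    return weighted / total


p_xy = eba("x", {"x", "y"}, ASPECTS, U)
p_x = eba("x", {"x", "y", "z"}, ASPECTS, U)
print(p_xy, p_x)
```

With these assumed measures the binary probability agrees with the Restle-type ratio (1 + 1)/(1 + 1 + 2 + 2), and the three trinary probabilities sum to one.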


Since there may be many aspects that are unique to $x$ or common to $x$
and $y$ only, $\alpha$, $\theta$, etc. should be interpreted as collections of
aspects. However, for the purposes of the present treatment it is possible to
combine, say, all the aspects that are unique to $x$, and treat them as a single
aspect. Formally, for any nonempty proper subset $A$ of $T$ let
$\bar{A} = \{\alpha \mid \alpha \in x' \text{ for all } x \in A \text{ and }
\alpha \notin y' \text{ for any } y \in T - A\}$. Thus, $\bar{A}$ is the set of
aspects shared by all alternatives of $A$ that are not shared by any alternative
in $T - A$, and $\{\bar{A} \mid A \subset T,\ A \neq T, \emptyset\}$ is a
partition of the set of all aspects into $2^n - 2$ aspect sets, where $n$ is
the number of alternatives in $T$. To avoid additional notation we use
$\alpha$, $\beta$, etc. to denote these aspect sets and suppress the
distinction between individual aspects and collections of aspects.

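If all pairs of distinct alternatives in $T$ are aspect-wise disjoint,
i.e., $x' \cap y'$ is null, then $P(x,A_\alpha) = 1$ for any $\alpha$ in $x'$,
hence Equation (1) reduces to

$$P(x,A) = \frac{\sum_{\alpha \in x'} u(\alpha)}{\sum_{\beta \in A'} u(\beta)}
         = \frac{u(x)}{\sum_{y \in A} u(y)},
\quad \text{where } u(x) = \sum_{\alpha \in x'} u(\alpha). \tag{2}$$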

This is the choice model developed by Luce (1959, 1977). When all choice
probabilities are nonzero, Luce's model is equivalent to the assumption
that the ratio P(x,A)/P(y,A) is a constant which depends on x and y but
not on the offered set A. Hence, it is called the constant-ratio model,
abbreviated CRM. This model is simple and parsimonious; it expresses all
probabilities of choice among n alternatives in terms of n scale values.
(Since the unit of measurement is arbitrary, the number of independent
parameters to be estimated is one less than the number of scale values.) The
constant-ratio model, however, fails to account for the effects of
similarity between alternatives on choice probability, as shown by several
authors, e.g., Debreu (1960), Luce and Suppes (1965), Restle (1961),
Rumelhart and Greeno (1971), Tversky (1972a). The relevant experimental
studies were reviewed by Luce (1977).
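
The constant-ratio property of Equation (2) can be illustrated with a minimal sketch. The function name `crm` and the scale values below are hypothetical; the point is only that the ratio P(x,A)/P(y,A) is the same for every offered set containing both x and y.

```python
from fractions import Fraction


def crm(x, A, u):
    """Return P(x, A) under the constant-ratio model of Equation (2)."""
    if x not in A:
        return Fraction(0)
    return Fraction(u[x]) / sum(Fraction(u[y]) for y in A)


# Made-up scale values u(x), u(y), u(z), for illustration only.
SCALE = {"x": 2, "y": 1, "z": 1}

ratio_pair = crm("x", {"x", "y"}, SCALE) / crm("y", {"x", "y"}, SCALE)
ratio_triple = crm("x", {"x", "y", "z"}, SCALE) / crm("y", {"x", "y", "z"}, SCALE)
print(ratio_pair, ratio_triple)  # both ratios equal u(x)/u(y)
```

Because the ratio depends only on the two scale values, adding z takes probability from x and y in proportion to their current shares, whatever z resembles. This is the feature that the similarity examples challenge.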

In contrast, EBA provides a natural explanation of the similarity
effect. Furthermore, it has several testable consequences that impose
considerable constraints on observed choice probabilities and permit a
measurement-free test of the model. The EBA model, however, does not restrict
the structure of the aspects in any way, and hence it yields a large
number of scale values ($2^n - 2$) which limits its use as a scaling model. In
particular, EBA cannot be estimated from binary choice probabilities
since the number of parameters exceeds the number of data points. The
question arises then whether EBA can be significantly simplified by
imposing some structure on the set of aspects. Stated differently, can we
formulate an adequate theory of choice that is less restrictive than CRM
and more parsimonious than EBA? We can view CRM as the set-theoretical
analogue of a unidimensional representation and EBA as the counterpart of
a high dimensional representation. What then is the analogue of low
dimensionality in a set-theoretical representation?

In this paper we investigate the representation of choice alternatives
as  a  tree-like  graph.  A  graph  is  a  collection  of  points,  called  nodes, 
some  of  which  are  linked  directly  by  lines  called  edges  or  links.  A 
sequence  of  adjacent  links  with  no  repetitions  is  called  a  path.  A  (rooted) 
tree  is  a  connected  graph  without  cycles  containing  a  distinguished  node 
called  the  root.  Thus,  any  two  nodes  in  a  tree  are  joined  by  a  path,  and 
no  path  starts  and  ends  at  the  same  node.  For  ease  of  reference,  we  place 
the  root  at  the  top  of  the  tree  and  the  terminal  nodes  at  the  bottom  as  in 
Figure 2. To interpret a rooted tree as a family of aspect sets, we
associate each terminal node of the tree with a single alternative in T,
and each link of the tree with the set of aspects that are shared by all
the alternatives which include (or follow from) that link and are not
shared by any of the alternatives which do not include that link. Naturally,
the length of each link in the tree represents the measure of the respective
set of aspects. Hence, the set of all aspects that belong to a given
alternative is represented by the path from the root of the tree to the
terminal  node  associated  with  the  alternative,  and  the  length  of  the  path 
represents  the  overall  measure  of  the  alternative. 


Insert  Figure  2  here 

An illustrative example of a tree representation of a menu is presented
in Figure 2. The set of alternatives consists of five entrees: steak,
roast beef, lamb, sole and trout, that appear as the terminal nodes of
the tree. Thus, the link labelled $\lambda$ represents the aspects shared by
all meat entrees but not fish, $\theta$ represents the aspects shared by steak
and roast beef but not lamb or fish, and $\gamma$ represents the unique aspects
of lamb. The names of the alternatives are displayed vertically and the
suggested labels of the clusters (defined by the links) are displayed
horizontally.

A tree representation imposes considerable constraints on the family
$T^* = \{x' \mid x \in T\}$ of aspect sets associated with a given set of
alternatives. In particular, a tree defines a hierarchical structure on the
alternatives in $T$ induced by associating each link $\alpha$ of the tree with
the set $T_\alpha = \{x \in T \mid \alpha \in x'\}$ of all alternatives that
include, or follow from, that link. In Figure 2, for example,
$T_\mu = \{\text{sole}, \text{trout}\}$ and $T_\alpha = \{\text{steak}\}$. It is
easy to verify that for any two links $\alpha, \beta$ in a tree, either
$T_\alpha \supset T_\beta$, or $T_\beta \supset T_\alpha$, or
$T_\alpha \cap T_\beta$ is empty. The constraints implied by the tree greatly
restrict the structure under consideration and drastically reduce the
number of parameters from $2^n - 2$ (the number of proper nonempty subsets of
$T$) to $2n - 2$, which corresponds to the maximal number of links in a tree
with $n$ terminal nodes. To appreciate the nature of the constraints, note that
the paths which connect any three terminal nodes with the root either
all meet at the same node, or two paths join at one node while the third
path joins them at a higher node, i.e., one that is closer to the root.
In Figure 2, for example, 'steak' and 'roast beef' join first and then
lamb joins them later.

This property of trees implies the following inclusion rule: for
all $x,y,z$ in $T$, either $x' \cap y' \supseteq x' \cap z'$ or
$x' \cap z' \supseteq x' \cap y'$. That is, one out of any two binary
intersections of three alternatives includes the other. Equivalently, any
subset of $T$ with three elements contains one alternative, say $z$, such that
$z' \cap x' = z' \cap y'$ which, in turn, is included in $x' \cap y'$.
We denote this relation by $(x,y)z$, with or without a comma. Thus, the
tree in Figure 2 is described as ((steak, roast beef) lamb) (sole, trout).
Figure 3a illustrates the inclusion rule by a Venn diagram, and Figure
3b displays the corresponding tree.


Insert  Figures  3a  and  3b  here 


A comparison of Figures 1 and 3a reveals that, under the inclusion rule,
two out of the three binary intersections coincide with the triple
intersection ($x' \cap z' = y' \cap z' = x' \cap y' \cap z'$), hence the number
of parameters or aspect sets reduces in this case from 6 to 4, excluding
$\lambda$ that represents the aspects shared by all three alternatives. The
following elementary result, proved in the mathematical appendix, shows that
the inclusion rule is not only necessary but also sufficient for representation
by a tree.

STRUCTURE THEOREM: A family $\{x' \mid x \in T\}$ of aspect sets is
representable by a tree iff either $x' \cap y' \supseteq x' \cap z'$ or
$x' \cap z' \supseteq x' \cap y'$ for all $x,y,z$ in $T$.
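
The inclusion rule is easy to check mechanically. The following sketch tests every triple of alternatives; the function name and the two families of aspect sets are illustrative assumptions (the first mimics a menu-like tree, the second the unrestricted three-alternative case in which every pair shares its own aspect).

```python
from itertools import permutations


def satisfies_inclusion_rule(aspects):
    """True if, for all x, y, z, one of x'&y' and x'&z' includes the other."""
    for x, y, z in permutations(aspects, 3):
        xy = aspects[x] & aspects[y]
        xz = aspects[x] & aspects[z]
        if not (xy >= xz or xz >= xy):  # >= on sets means "is a superset of"
            return False
    return True


# Hypothetical tree-like family: meat entrees share "lam", fish share "mu".
TREE_LIKE = {
    "steak": {"alpha", "theta", "lam"},
    "roast beef": {"beta", "theta", "lam"},
    "lamb": {"gamma", "lam"},
    "sole": {"delta", "mu"},
    "trout": {"epsilon", "mu"},
}

# Hypothetical unrestricted family: each pair has its own shared aspect.
NOT_A_TREE = {
    "x": {"alpha", "theta", "delta"},
    "y": {"beta", "theta", "mu"},
    "z": {"gamma", "delta", "mu"},
}

print(satisfies_inclusion_rule(TREE_LIKE), satisfies_inclusion_rule(NOT_A_TREE))
```

By the Structure Theorem, a family that passes this check is representable by a tree, and one that fails is not.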

When  the  family  {x'|xeT}  of  aspect  sets  satisfies  the  inclusion  rule, 
the  process  of  elimination-by-aspects  reduces  to  elimination-by-tree, 
or  EBT  for  short.  That  is,  one  selects  a  link  from  the  tree  (with  probability 
that is proportional to its length) and then eliminates all the
alternatives that do not include the selected link. The same process is then
applied to the selected branch, until only one alternative remains. In
Figure 3, for example,
$P(x,\{x,y,z\}) = u(\alpha) + u(\theta)u(\alpha)/(u(\alpha) + u(\beta))$, and
$P(z,\{x,y,z\}) = u(\gamma)$, assuming the measure $u$ is normalized so that
$u(\alpha) + u(\beta) + u(\gamma) + u(\theta) = 1$. Elimination by tree, then,
is simply the application of elimination by aspects to a tree structure. Note
that CRM corresponds to a degenerate tree, or a bush, with only one internal
node, the root.

Hierarchical  Elimination 

The  representation  of  choice  alternatives  as  a  tree  suggests  an 
alternative  decision  model  in  which  the  tree  is  viewed  as  a  hierarchy  of 
choice points. This theory, called the hierarchical elimination model or
HEM,  can  be  described  as  follows.  One  begins  at  the  top  of  the  tree  and 
selects  first  among  the  major  branches,  or  the  links  that  follow  directly 
from  the  root.  One  then  proceeds  to  the  next  choice  point  at  the  bottom  of 
the  selected  link,  and  the  process  is  repeated  until  the  chosen  branch 
contains  a  single  alternative.  The  probability  of  choosing  an  alternative 
x  from  an  offered  set  A  is  the  product  of  the  probabilities  of  selecting 
the  branches  containing  x  at  each  stage  of  the  process,  and  the  probability 
of  selecting  a  branch  is  proportional  to  its  overall  weight.  For  example, 
the  probability  of  choosing  trout  from  the  choice  set  presented  in 
Figure  2  equals  the  probability  of  selecting  fish  over  meat  multiplied 

by  the  probability  of  choosing  trout  over  sole.  Thus,  each  node  in  the 
tree is treated as a choice point, and one proceeds in order from the top
to  the  bottom  of  the  hierarchy. 

To define the hierarchical elimination model in a more formal manner,
let $A_\alpha$ denote the set of alternatives in $A$ that include the link
$\alpha$, i.e., $A_\alpha = \{x \in A \mid \alpha \in x'\}$. Define
$\alpha|\beta$ if $\beta$ follows directly from $\alpha$, i.e.,
$A_\alpha \supset A_\beta$, and $A_\gamma \supset A_\beta$ implies
$A_\gamma \supseteq A_\alpha$. Let $u(\alpha)$ denote the length
of $\alpha$, and let $m(\alpha)$ be the measure, or the total length, of all
the links that follow from $\alpha$, including $\alpha$. In Figure 3b, for
example, $\theta|\alpha$, $\theta|\beta$, and
$m(\theta) = u(\alpha) + u(\beta) + u(\theta)$. If $T^*$ is a tree and
$A \subset T$, then $A^* = \{x' \mid x \in A\}$ is also a tree that is referred
to as a subtree of T. Naturally, the relation $|$ and the measure $u$ on $T^*$
induce corresponding relations and measures on $A^*$. Finally, for
$B \subset A$, let $P(B,A)$ denote the probability that the alternative
selected from $A$ is also an element of $B$, i.e.,
$P(B,A) = \sum_{x \in B} P(x,A)$.

A family of choice probabilities $P(x,A)$, $x \in A \subseteq T$, is said to
satisfy HEM if there exists a tree $T^*$, with a measure $u$, such that the
following three conditions hold:

(3)  (a) if $\gamma|\beta$ and $\beta|\alpha$ then
         $P(A_\alpha, A_\gamma) = P(A_\alpha, A_\beta)\,P(A_\beta, A_\gamma)$;

     (b) if $\gamma|\beta$ and $\gamma|\alpha$ then
         $\dfrac{P(A_\alpha, A_\gamma)}{P(A_\beta, A_\gamma)} = \dfrac{m(\alpha)}{m(\beta)}$,
         provided $P(A_\beta, A_\gamma) \neq 0$;

     (c) the above conditions also hold for any subtree $A^*$ of $T^*$, with the
         induced structure on $A^*$.

The first condition implies that the probability of selecting x, say,
from T is the product of the probabilities of selecting the branches that
contain x at each junction. This condition is readily testable since it is
formulated directly in terms of choice probability, with no reference to
the scale u. The second condition states that the probabilities of selecting
one branch rather than another at a given junction are proportional to the
weights of the respective branches, defined as the total length of all
their links. If we view each junction as a pan balance and the weight of
each subtree as mass, then (b) can be interpreted as a weighing process
where the probability of choice among subtrees is proportional to their
mass. The third condition ensures that (a) and (b) apply not only to the
entire tree, but also to any subtree obtained by deleting alternatives
from T. Note that the above definition of HEM, like the definition of
EBA, excludes in effect the presence of identical alternatives. Thus, we
assume that any two alternatives have some distinctive aspects with
a nonzero measure, however small.

The notion of hierarchical elimination and the idea of elimination-by-tree
represent different conceptions of the choice process that assume a tree
structure. EBT describes $P(x,A)$ as a weighted sum of the probabilities
$P(x,A_\alpha)$ of selecting $x$ from the various subsets of $A$. In HEM,
on the other hand, $P(x,A)$ is expressed as a product of the probabilities
$P(A_\beta, A_\alpha)$, $\alpha|\beta$, of selecting a subtree containing $x$
at each level in the hierarchy. Compare, for example, the two formulas for the
probability of choosing steak from the set of entrees $T$ displayed in Figure 2.
To simplify the notation we suppress the scale $u$ and write $\alpha$ for
$u(\alpha)$, etc. Furthermore, the scale is normalized so that the measures of
all the links in Figure 2 sum to 1. According to EBT, then,

$$P(\text{steak}, T) = \alpha + \theta\left(\frac{\alpha}{\alpha+\beta}\right)
  + \lambda\left(\frac{\alpha}{\alpha+\beta+\gamma+\theta}
  + \frac{\theta}{\alpha+\beta+\gamma+\theta} \times \frac{\alpha}{\alpha+\beta}\right)$$

whereas according to HEM

$$P(\text{steak}, T) = (\alpha+\beta+\gamma+\theta+\lambda)
  \times \left(\frac{\alpha+\beta+\theta}{\alpha+\beta+\gamma+\theta}\right)
  \times \left(\frac{\alpha}{\alpha+\beta}\right)$$


The  difference  in  form  reflects  a  difference  in  processing  strategy.  EBT  assumes 
free  access;  that  is,  each  aspect  can  be  selected  (as  a  basis  for 
elimination) at any stage of the process. On the other hand, HEM assumes
sequential  access;  that  is,  aspects  are  considered  in  a  fixed  hierarchical 
fashion.  The  contrast  between  models  based  on  random  and  on  sequential 
access  can  also  be  found  in  theoretical  analyses  of  memory  and  pattern 
recognition. 

It would appear that EBT is applicable to decisions, such as the
selection  of  a  restaurant  or  the  choice  of  a  movie  where  there  is  no  fixed 
sequence  of  choice  points,  whereas  HEM  seems  appropriate  for  decisions  that 
induce  a  natural  hierarchy  of  choice  points.  A  student  who  has  to  decide 
what  to  do  after  graduation,  for  example,  is  more  likely  to  consider  the 
alternatives  in  a  hierarchical  manner.  She  may  first  decide  whether  to  go 
to  graduate  school,  travel,  or  take  a  job.  And  she  may  not  evaluate  in 
detail  the  available  graduate  schools,  travel  plans,  or  job  opportunities, 
before  the  initial  decision  is  resolved.  The  preceding  discussion  suggests 
that  EBT  and  HEM  capture  different  decision  strategies  that  might  be 
followed  in  different  situations.  However,  the  following  theorem  establishes 
a  rather  surprising  result  that,  despite  the  difference  in  mathematical 
form  and  psychological  interpretation,  the  two  models  are  actually  equivalent. 

EQUIVALENCE  THEOREM:  EBT  and  HEM  are  equivalent.  That  is,  any  set  of 
choice  probabilities  satisfies  one  model  iff  it  satisfies  the  other. 

The  proof  of  the  Equivalence  Theorem  is  given  in  Section  II  of  the  Appendix. 

It  shows  that,  given  a  tree  T*  with  a  measure  u,  EBT  and  HEM  yield 
identical  choice  probabilities  and  hence  it  is  impossible  to  discriminate 
between these strategies on the basis of these data alone. It might be
possible, however, that other data such as verbal protocols, reaction time
or eye movements can be used to distinguish between the two strategies. To
avoid confusion, we shall use the term 'preference tree' or 'Pretree' to
denote the choice probabilities generated by EBT or by HEM, irrespective
of the particular strategy.
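
The equivalence can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not the proof given in the Appendix: the tree mimics the menu of Figure 2, all link lengths are made-up values, and the helper names (`prune`, `mass`, `hem`, `ebt`) are hypothetical. A tree is written as a list of branches, each a pair (link length, leaf name or list of sub-branches).

```python
from fractions import Fraction
from itertools import combinations

MENU = [
    (2, [(1, [(3, "steak"), (2, "roast beef")]), (4, "lamb")]),  # meat branch
    (1, [(2, "sole"), (3, "trout")]),                            # fish branch
]


def leaves(tree):
    out = []
    for _, child in tree:
        out += [child] if isinstance(child, str) else leaves(child)
    return out


def prune(tree, offered):
    """Induced subtree on the offered set; one-child chains merge into one link."""
    out = []
    for length, child in tree:
        if isinstance(child, str):
            if child in offered:
                out.append((Fraction(length), child))
            continue
        sub = prune(child, offered)
        if len(sub) == 1:
            out.append((length + sub[0][0], sub[0][1]))
        elif sub:
            out.append((Fraction(length), sub))
    return out


def mass(branch):
    """m(.): the length of a link plus the lengths of all links below it."""
    length, child = branch
    below = 0 if isinstance(child, str) else sum(mass(b) for b in child)
    return Fraction(length) + below


def hem(x, tree):
    """Product of branch-selection probabilities along the path to x."""
    total = sum(mass(b) for b in tree)
    for branch in tree:
        child = branch[1]
        if child == x:
            return mass(branch) / total
        if not isinstance(child, str) and x in leaves(child):
            return mass(branch) / total * hem(x, child)
    return Fraction(0)


def path_sum(x, tree):
    """Sum over links a on the path to x of u(a) * P(x, A_a), unnormalized."""
    for length, child in tree:
        if child == x:
            return Fraction(length)
        if not isinstance(child, str) and x in leaves(child):
            below = path_sum(x, child)
            p_below = below / sum(mass(b) for b in child)  # P(x, A_a)
            return Fraction(length) * p_below + below
    return Fraction(0)


def ebt(x, tree):
    """Elimination by tree: any link may be selected first."""
    return path_sum(x, tree) / sum(mass(b) for b in tree)


names = leaves(MENU)
agree = all(
    ebt(x, prune(MENU, set(c))) == hem(x, prune(MENU, set(c)))
    for k in range(1, len(names) + 1)
    for c in combinations(names, k)
    for x in c
)
print(ebt("steak", MENU), hem("steak", MENU), agree)
```

With these assumed lengths both procedures assign steak the same probability in the full menu, and they agree on every offered subset, as the theorem asserts.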

An  immediate  corollary  of  the  equivalence  of  EBT  and  HEM  is  that  any 
alternating  strategy  consisting  of  a  mixture  of  EBT  and  HEM  is  also  equivalent 
to  them.  For  example,  a  person  may  choose  a  restaurant  according  to  EBT  but 
select  an  entree  according  to  HEM,  or  vice  versa.  It  is  a  remarkable  fact 
that  all  the  various  strategies  obtained  by  alternating  EBT  and  HEM  yield 
identical  choice  probabilities.  Thus,  Pretree  provides  a  versatile  representation 
of choice that is compatible with both random-access and sequential-access strategies.

Consequences

We  turn  now  to  discuss  general  properties  and  testable  consequences  of  the 
tree  model,  starting  with  the  similarity  effect.  There  are  two  distinct  ways 
in which the similarity between alternatives affects choice probability. First,
similarity, or the presence of common aspects, creates statistical dependence
among alternatives. If x has more in common with y than with z, for example,
then the addition of x to the set {z,y} tends to hurt the similar alternative y
more than the less similar one z. In the extreme case where x is almost
identical to y, the addition of x will divide the probability of choosing y
by two while leaving the probability of choosing z unchanged.
Second, similarity facilitates comparison. If x is more similar to y than
to z, and P(y,z) = 1/2, then P(x,z) will be less extreme than P(x,y),
i.e., closer to 1/2. Thus, the more similar pair generally yields a more
extreme choice probability because similarity facilitates the comparison
between the alternatives.

To illustrate the effects of similarity, consider a hypothetical example
of choice among transportation modes. Suppose the available alternatives
include two airlines $a_1$ and $a_2$, and two trains $t_1$ and $t_2$. Suppose
further that there is no reason to prefer one airline over the other, but train
$t_2$ has a very slight but clear advantage over $t_1$ since it makes one fewer
stop along the way. Because the train is more comfortable but the plane is
faster, suppose one is undecided as to whether to fly or take a train, and hence

$$P(a_1,a_2) = 1/2, \quad P(t_2,t_1) = 1, \quad P(a_1,t_1) = P(a_2,t_1) = 1/2.$$

Let $P(x,xyz)$ denote $P(x,\{x,y,z\})$. It follows at once from CRM that
$P(t_1,t_1a_1a_2) = 1/3$. Introspection suggests, however, that the selection
from $\{t_1,a_1,a_2\}$ is likely to be viewed as a choice between a train and a
plane, whence $a_1$ and $a_2$ are treated as one alternative that is compared
with $t_1$. Consequently, $P(t_1,t_1a_1a_2)$ will be close to 1/2, while the two
other trinary choice probabilities will be close to 1/4. The commonality between
$a_1$ and $a_2$, therefore, produces a statistical dependence which increases
the relative advantage of the odd alternative $t_1$.

Furthermore, CRM implies that if two alternatives are equivalent
in one context, then they are substitutable in any context. That is, it
should be possible to substitute one for the other without changing choice
probability. Since $P(a_1,t_1) = 1/2$ and $P(t_2,t_1) = 1$, we obtain by
substitution $P(t_2,a_1) = 1$. This result, however, seems implausible because
the slight albeit definite advantage of $t_2$ over $t_1$ is not likely to
eliminate all conflict in the choice between $t_2$ and $a_1$. $P(t_2,a_1)$,
therefore, is expected to be significantly smaller than one, contrary to CRM.
Further discussions of this problem, originally presented by Debreu (1960), can
be found in Luce and Suppes (1965, pp. 334-335) and Tversky (1972a, pp. 282-284).


Insert  Figure  4  here 


Figure 4 represents the above example as a preference tree. It is
easy to verify that, according to the tree model with $\alpha = \beta$ and
$\theta + \alpha = \delta$,
$P(a_1,a_2) = P(t_1,a_1) = P(t_1,a_2) = 1/2$, $P(t_2,t_1) = 1$, but
$P(t_2,a_2) = (\gamma+\delta)/(\gamma+2\delta)$, which approaches 1/2 as
$\gamma$ approaches 0. Furthermore,
$P(t_1,t_1a_1a_2) = \delta/(2\delta+\alpha)$, which approaches 1/2 as $\alpha$
approaches 0. Hence the tree model provides a simple and parsimonious account
of the similarity effects that are incompatible with CRM.
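
These values can be reproduced with a short computation. The sketch below assumes a particular reading of Figure 4, which is not shown here: the two airlines share an aspect `theta` and have unique aspects `alpha` and `beta`, the two trains share `delta`, and the better train has one extra aspect `gamma`. The numerical measures are made up, chosen only to satisfy alpha = beta and theta + alpha = delta.

```python
from fractions import Fraction

# Assumed aspect structure and illustrative measures (not taken from the figure).
ASPECTS = {
    "a1": {"alpha", "theta"},
    "a2": {"beta", "theta"},
    "t1": {"delta"},
    "t2": {"delta", "gamma"},
}
U = {"alpha": 1, "beta": 1, "theta": 9, "delta": 10, "gamma": 1}


def p(x, A):
    """P(x, A) by elimination by aspects, discarding aspects common to all of A."""
    A = frozenset(A)
    if x not in A:
        return Fraction(0)
    if len(A) == 1:
        return Fraction(1)
    sets = [ASPECTS[y] for y in A]
    common = set.intersection(*sets)
    total = sum(Fraction(U[a]) for a in set.union(*sets) - common)
    weighted = Fraction(0)
    for a in ASPECTS[x] - common:
        survivors = frozenset(y for y in A if a in ASPECTS[y])
        weighted += Fraction(U[a]) * p(x, survivors)
    return weighted / total


print(p("a1", {"a1", "a2"}), p("t1", {"t1", "a1"}), p("t2", {"t2", "t1"}))
print(p("t2", {"t2", "a2"}))        # (gamma + delta) / (gamma + 2 * delta)
print(p("t1", {"t1", "a1", "a2"}))  # delta / (2 * delta + alpha)
```

With gamma and alpha small relative to delta, the last two probabilities are close to 1/2, whereas the constant-ratio model would give 1 and 1/3 respectively.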

The effects of similarity on choice probability can also be explained
by a Thurstonian or a random utility model such as the additive random aspect
model (Tversky, 1972b). In this development each aspect $\alpha$ is represented
by a random variable $V_\alpha$, each $x$ in $T$ is represented by the random
variable $V_x = \sum_{\alpha \in x'} V_\alpha$ and, following the random utility
model, $P(x,A)$ equals $P(V_x \geq V_y \text{ for all } y \in A)$. This model,
like EBA, accounts for the observed dependence among the alternatives in terms
of their common aspects that produce positive correlations among the respective
random variables. An additive random aspect model differs from the present
development in that the aspects are represented by random variables rather than
by constants, and choice is described as a comparison of sums of random
variables rather than as a sequential elimination process. Nevertheless, it was
shown (Tversky, 1972b) that EBA, and hence Pretree, is also expressible as a
random utility model, though not necessarily an additive one. A random
utility analog of the tree model, developed by McFadden (1978), is
discussed later.

The  following  testable  properties  were  derived  from  EBA  (see 
Tversky  1972a, b;  Sattath  and  Tversky  1976).  Since  EBT  is  a  special  case  of 
EBA,  these  properties  apply  to  the  tree  model  as  well. 

Moderate Stochastic Transitivity: If $P(x,y) \ge 1/2$ and $P(y,z) \ge 1/2$ then
$P(x,z) \ge \min(P(x,y), P(y,z))$.

This is a probabilistic form of the transitivity assumption. Note that the tree
model does not entail the stronger property where 'min' is replaced by 'max'.

Regularity: $P(x,A) \ge P(x,A \cup B)$.

The probability of selecting x from a given offered set cannot be increased
by enlarging that set.

The Multiplicative Inequality: $P(x,A \cup B) \ge P(x,A)P(x,B)$.

The probability of selecting x from $A \cup B$ is at least as large as the
probability of choosing x from both A and B in two independent choices.

The properties discussed so far follow from the general EBA model. We
turn now to some new properties of binary choice probabilities that characterize
the tree model. To simplify the exposition we introduce the probability
ratio $R(x,y) = P(x,y)/P(y,x)$, and restrict the discussion to the case where
$P(x,y) \ne 0$ so that $R(x,y)$ is always well defined. The results can be
readily extended to deal with choice probabilities that equal 0 or 1.

Consider first the case of three alternatives, and note that any subtree
of three elements has the form portrayed in Figure 5, except for the
permutation of the alternatives and the possibility of vanishing links. We
use the parentheses notation to describe the structure of the tree, e.g.,
the tree in Figure 5 is described by $(xy)z$ and the tree in Figure 4 by
$(a_1a_2)(t_1t_2)$.


Insert  Figure  5  here 


Using the notation of Figure 5 it follows at once that $R(x,y) = \alpha/\beta$
is more extreme (i.e., further from one) than
$R(x,z)/R(y,z) = (\alpha+\theta)/(\beta+\theta)$. Hence any three elements that
form a subtree $(xy)z$ satisfy the following trinary condition.

(4)  If $R(x,y) \ge 1$ then $R(x,y) \ge R(x,z)/R(y,z) \ge 1$,

where a strict inequality in the hypothesis implies strict inequalities in
the conclusion, and an equality in the hypothesis implies equalities in the
conclusion.

The trinary condition (4) reflects the similarity hypothesis in that
the commonality between alternatives enhances their discriminability. This
is seen most clearly in the case where $\theta > 0$, $\alpha > \beta$, and
$\beta + \theta = \gamma$, i.e., $R(x,y) > 1$ and $R(y,z) = 1$, see Figure 5.
According to the trinary condition,
$R(x,y) = \alpha/\beta > (\alpha+\theta)/(\beta+\theta) = R(x,z)$. Although y
and z are pairwise equivalent, $P(x,y)$ exceeds $P(x,z)$ because x shares more
aspects with y than with z. Note that when $\theta$ vanishes,
$R(x,y) = R(x,z)/R(y,z)$ as required by CRM. In this case, where $(xy)z$,
$(xz)y$ and $(zy)x$ all hold, we omit the parentheses altogether and write $xyz$.
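
As an illustration of the trinary condition, the sketch below computes the three probability ratios for a subtree $(xy)z$, assuming the link labels used in the text for Figure 5 ($\alpha$ and $\beta$ for the unique links of x and y, $\theta$ for their shared link, $\gamma$ for z). The function names and the numerical values are hypothetical.

```python
from fractions import Fraction


def ratios_xy_z(alpha, beta, theta, gamma):
    """Return R(x,y), R(x,z), R(y,z) for the subtree (xy)z."""
    r_xy = Fraction(alpha, beta)
    r_xz = Fraction(alpha + theta, gamma)
    r_yz = Fraction(beta + theta, gamma)
    return r_xy, r_xz, r_yz


def trinary_condition_holds(r_xy, r_xz, r_yz):
    """Condition (4): if R(x,y) >= 1 then R(x,y) >= R(x,z)/R(y,z) >= 1."""
    if r_xy < 1:
        return True  # hypothesis not met for this ordering of x and y
    return r_xy >= r_xz / r_yz >= 1


# The case discussed in the text: theta > 0, alpha > beta, beta + theta = gamma.
r_xy, r_xz, r_yz = ratios_xy_z(alpha=3, beta=1, theta=2, gamma=3)
print(r_xy, r_xz, r_yz, trinary_condition_holds(r_xy, r_xz, r_yz))
```

With these values y and z are pairwise equivalent, yet R(x,y) exceeds R(x,z), as the similarity hypothesis requires.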

Next,  let  us  consider  sets  of  four  alternatives.  It  is  easy  to  verify 
that,  up  to  permutations  of  alternatives,  any  subtree  of  four  elements  has 
one  of  the  two  forms  displayed  in  Figure  6,  including  degenerate  forms  with 
one  or  more  vanishing  links. 


Insert  Figure  6  here 


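It follows readily that in the tree $(xy)(vw)$ portrayed in Figure 6a

(5)  $\frac{R(x,v)}{R(y,v)} = \frac{(\alpha+\theta)/(\gamma+\lambda)}{(\beta+\theta)/(\gamma+\lambda)} = \frac{(\alpha+\theta)/(\delta+\lambda)}{(\beta+\theta)/(\delta+\lambda)} = \frac{R(x,w)}{R(y,w)}$.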

If  we  interpret  R(x,v)/R(y,v)  as  an  indirect  measure  of  preference  for  x  over  y, 
measured  relative  to  a  standard  v,  then  the  above  quarternary  condition  asserts 
that  this  measure  is  the  same  for  different  standards  (v  and  w)  provided  the 
pairs  (x,y)  and  (v,w)  belong  to  distinct  clusters. 

If the relation among the four alternatives under consideration has the
form depicted in Figure 6b, that is $((xy)v)w$, then the following quarternary
condition holds.

(6)  $\frac{R(x,v) - R(y,v)}{R(x,w) - R(y,w)} = \frac{(\alpha-\beta)/\gamma}{(\alpha-\beta)/\delta} = \frac{(\alpha+\theta-\gamma)/\gamma}{(\alpha+\theta-\gamma)/\delta} = \frac{R(x,v) - R(v,v)}{R(x,w) - R(v,w)}$

Note that under CRM the quarternary conditions hold for any four alternatives.
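
The two quarternary conditions can be verified numerically for any choice of link lengths. The sketch below assumes the labeling implied by the formulas: in $(xy)(vw)$ the links are $\alpha$, $\beta$ (x, y), $\theta$ (shared by x and y), $\gamma$, $\delta$ (v, w) and $\lambda$ (shared by v and w); in $((xy)v)w$ the link $\lambda$ is shared by x, y and v. The function names and values are hypothetical.

```python
from fractions import Fraction as F


def condition_5(alpha, beta, theta, gamma, delta, lam):
    """Both sides of (5) for the tree (xy)(vw) of Figure 6a."""
    r_xv, r_yv = F(alpha + theta, gamma + lam), F(beta + theta, gamma + lam)
    r_xw, r_yw = F(alpha + theta, delta + lam), F(beta + theta, delta + lam)
    return r_xv / r_yv, r_xw / r_yw


def condition_6(alpha, beta, theta, gamma, delta, lam):
    """Both sides of (6) for the tree ((xy)v)w of Figure 6b.

    Requires alpha != beta and alpha + theta != gamma so that no
    denominator vanishes.
    """
    r_xv, r_yv = F(alpha + theta, gamma), F(beta + theta, gamma)
    r_xw, r_yw = F(alpha + theta + lam, delta), F(beta + theta + lam, delta)
    r_vw = F(gamma + lam, delta)
    left = (r_xv - r_yv) / (r_xw - r_yw)
    right = (r_xv - 1) / (r_xw - r_vw)      # R(v,v) = 1
    return left, right


print(condition_5(4, 1, 2, 3, 5, 1))
print(condition_6(4, 1, 2, 3, 5, 1))
```

In both cases the two returned values coincide, and for (6) they reduce to $\delta/\gamma$.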

At this point, the reader may suspect that the consideration of more
elaborate tree structures involving larger sets of alternatives will yield
additional independent consequences. However, the following theorem shows
that the trinary and the quarternary conditions are not only necessary
but also sufficient to ensure the representation of binary choice
probabilities as a preference tree.

REPRESENTATION THEOREM: A set of nonzero binary choice probabilities
satisfies the tree model with a given structure iff the trinary (4) and
the quarternary (5 & 6) conditions are satisfied relative to that structure.

The theorem shows that if Equations (4), (5) and (6) are satisfied relative
to some tree structure, then there exists a ratio scale u defined on
that structure such that

$P(x,y) = \frac{u(x'-y')}{u(x'-y') + u(y'-x')}$  or  $R(x,y) = \frac{u(x'-y')}{u(y'-x')}$.

Recall that $u(x'-y')$ is the measure of the aspects of x that are not
included in y, or the length of the path from the terminal node associated
with x to the meeting point of the paths from x and y to the root.
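
The path-length reading of $u(x'-y')$ translates directly into a computation on a rooted tree. The following is a minimal sketch, assuming the tree is stored as a hypothetical dictionary `parent` that maps each non-root node to a pair (parent node, link length); the node names are illustrative only.

```python
def path_to_root(node, parent):
    """Return the nodes on the path from `node` up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node][0]
        path.append(node)
    return path


def unique_measure(x, y, parent):
    """u(x' - y'): length of the path from x up to where it meets y's path."""
    on_path_of_y = set(path_to_root(y, parent))
    total = 0.0
    node = x
    while node not in on_path_of_y:
        node, length = parent[node][0], parent[node][1]
        total += length
    return total


def binary_probability(x, y, parent):
    """P(x,y) = u(x'-y') / (u(x'-y') + u(y'-x')); undefined if both are zero."""
    u_x = unique_measure(x, y, parent)
    u_y = unique_measure(y, x, parent)
    return u_x / (u_x + u_y)


# Hypothetical tree (xy)z: x and y hang from node "n", which shares a link
# of length 2 with the root; z hangs directly from the root.
parent = {"x": ("n", 3), "y": ("n", 1), "n": ("root", 2), "z": ("root", 3)}
print(binary_probability("x", "y", parent))
print(binary_probability("x", "z", parent))
```

Storing only parent pointers keeps the meeting point easy to find: it is the first node on the upward path from x that also lies on the path from y.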

The proof of the Representation Theorem is presented in Section III of
the Appendix. This result shows, in effect, how to construct a preference
tree from binary choice probabilities whenever the necessary conditions
hold. The trinary and quarternary conditions are readily testable, given
any specified tree structure. Moreover, they can be used to determine which
structure, if any, is compatible with the data. Recall that at least one
permutation of every triple must satisfy Equation (4), and at least one
permutation of every quadruple must satisfy Equation (5) or (6). Hence, by
finding the appropriate permutations of all triples and quadruples, any
tree structure that is compatible with the data will emerge. It is readily
verified that the scale values (i.e., the lengths of the links associated
with a particular tree structure) are uniquely determined up to an arbitrary
unit of measurement, except when all binary choice probabilities are
one-half. The tree structure, however, is not always unique. That is, a given
set of binary choice probabilities could be compatible with more than one
tree structure. An example of this kind is presented in Section IV of
the Appendix along with a proof of the proposition that the tree structure
is uniquely determined by the set of binary and trinary choice probabilities.

Furthermore, if both binary and trinary choice probabilities are
available, they must satisfy the following conditions. Suppose the tree
model holds with $(xy)z$, see Figure 5, then

(7)  $\frac{P(x,z)}{P(z,x)} = \frac{\alpha+\theta}{\gamma} \ge \frac{\alpha+\theta\alpha/(\alpha+\beta)}{\gamma} = \frac{P(x,xyz)}{P(z,xyz)}$  and

(8)  $\frac{P(x,y)}{P(y,x)} = \frac{\alpha}{\beta} = \frac{\alpha+\theta\alpha/(\alpha+\beta)}{\beta+\theta\beta/(\alpha+\beta)} = \frac{P(x,xyz)}{P(y,xyz)}$,

provided all choice probabilities are nonzero. Thus, according to the
tree model with $(xy)z$, the constant-ratio rule (8) holds for the adjacent
pair (x,y) but not for the split pair (x,z). Note that this rule is
violated by (7) in the direction implied by the similarity hypothesis for
$(xy)z$. Since y is closer to x than to z in that structure (in the sense
that $y' \cap x' \supset y' \cap z'$), the addition of y to the set {x,z}
reduces the probability of choosing x proportionally more than the probability
of choosing z. On the other hand, since z is equally distant from x and from y
(in the sense that $x' \cap z' = y' \cap z'$), the addition of z to the set
{x,y} reduces the probabilities of choosing x and y by the same factor.
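
Relations (7) and (8) can be reproduced by computing the trinary choice probabilities directly. The sketch below assumes the elimination process on the tree $(xy)z$ with the link labels of Figure 5: a link is drawn with probability proportional to its length, and if the shared link $\theta$ is drawn, the choice between x and y is made by the binary rule. The function name and values are hypothetical.

```python
from fractions import Fraction


def trinary_probs_xy_z(alpha, beta, theta, gamma):
    """P(x,xyz), P(y,xyz), P(z,xyz) for the tree (xy)z."""
    total = alpha + beta + theta + gamma
    p_xy = Fraction(alpha, alpha + beta)           # binary P(x,y)
    p_x = (alpha + theta * p_xy) / total           # own link, or shared link then x
    p_y = (beta + theta * (1 - p_xy)) / total
    p_z = Fraction(gamma, total)
    return p_x, p_y, p_z


alpha, beta, theta, gamma = 3, 1, 2, 3
p_x, p_y, p_z = trinary_probs_xy_z(alpha, beta, theta, gamma)
print(p_x, p_y, p_z)
print("adjacent pair:", Fraction(alpha, beta), p_x / p_y)        # equal, as in (8)
print("split pair:", Fraction(alpha + theta, gamma), p_x / p_z)  # first is larger, as in (7)
```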


Aggregate Probabilities


So far, we have modeled the process by which an individual chooses among
alternatives. Because of the difficulties in obtaining independent repeated
choices from the same individual, most available data consist of the proportions
of individuals who selected the various alternatives, referred to as group
data or aggregate probabilities. It should be emphasized that these data do
not pertain to group decision making; they merely characterize the aggregate
preferences of different individuals.

It is well known that most probabilistic models for individual choice
(including CRM and EBA) are not preserved by aggregation. That is, group
probabilities could violate the model even though each individual satisfies
it, and vice versa. Consider, for instance, the case of three individuals
1, 2, 3 and three alternatives x, y, z. Suppose the observed choice
probabilities P(x,y), P(y,z) and P(z,x) are, respectively, .75, .75 and .15
for individual 1; .15, .75 and .75 for individual 2; and .75, .15 and .75
for individual 3. The individual choice probabilities all satisfy EBA, but
the expected aggregate probabilities, .55, .55 and .55, respectively, violate
EBA. Hence, the validity of EBA as a model for individual choice is neither
necessary nor sufficient for its validity as an aggregate model. Nevertheless,
we contend that similar principles govern both types of choice data, and
propose a new interpretation of EBA as an aggregate model.
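
One way to see the violation in this example is through moderate stochastic transitivity, which is a necessary consequence of EBA: the aggregate cycle P(x,y) = P(y,z) = P(z,x) = .55 gives P(x,z) = .45, below the required minimum of .55. The sketch below averages the three individuals and applies that check. It tests only this one necessary property, not EBA itself, and the helper names are hypothetical.

```python
from itertools import permutations


def complete(p):
    """Add the complementary probabilities P(b,a) = 1 - P(a,b)."""
    full = dict(p)
    for (a, b), value in p.items():
        full[(b, a)] = 1 - value
    return full


def satisfies_mst(p, items=("x", "y", "z"), tol=1e-9):
    """Moderate stochastic transitivity over every ordering of the items."""
    p = complete(p)
    for a, b, c in permutations(items, 3):
        if p[(a, b)] >= 0.5 and p[(b, c)] >= 0.5:
            if p[(a, c)] < min(p[(a, b)], p[(b, c)]) - tol:
                return False
    return True


individuals = [
    {("x", "y"): .75, ("y", "z"): .75, ("z", "x"): .15},
    {("x", "y"): .15, ("y", "z"): .75, ("z", "x"): .75},
    {("x", "y"): .75, ("y", "z"): .15, ("z", "x"): .75},
]
aggregate = {
    pair: sum(person[pair] for person in individuals) / len(individuals)
    for pair in individuals[0]
}
print([satisfies_mst(person) for person in individuals], satisfies_mst(aggregate))
```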

Suppose each individual chooses in accord with the following sequential
elimination rule. Given an offered set A, select some (nonempty) subset of
A, say B, and eliminate all the alternatives that do not belong to B.
Repeat the process until the selected subset consists of a single alternative.
Let $Q_A(B)$ be the proportion of subjects who first select B when presented
with the offered set A, i.e., the proportion of subjects who eliminate all
elements of $A - B$ in the first stage. Naturally,
$\sum_{B_i \subseteq A} Q_A(B_i) = 1$, and $Q_A(A) = 1$ iff A consists of a
single alternative. Note that $Q_A(B)$ is an elimination probability, not a
choice probability. The two constructs are related via the following equation.

(9)  $P(x,A) = \sum_{B_i \subset A} Q_A(B_i) P(x,B_i)$.

Thus, the proportion of subjects who choose x from A is obtained by summing,
over all proper subsets B of A, the proportion of individuals who first select B
multiplied by the proportion of subjects who choose x from the selected subset.

This general elimination model, by itself, does not restrict the observed
choice probabilities because we can always set $Q_A(B) = P(x,A)$ if $B = \{x\}$,
and $Q_A(B) = 0$ otherwise. Nevertheless, it provides a method for characterizing
probabilistic choice models in terms of the constraints they impose on the
elimination probabilities.

A family of elimination probabilities, $Q_A(B)$, $B \subset A \subset T$,
satisfies proportionality iff for all $A, B, C, B_i, C_j$ in $T$,

(10)  $\frac{Q_A(B)}{Q_A(C)} = \frac{\sum Q_T(B_i)}{\sum Q_T(C_j)}$,

where the summations range, respectively, over all subsets $B_i, C_j$ of T such
that $B_i \cap A = B$ and $C_j \cap A = C$. It is assumed that the denominators
are either both positive or both zero. This condition implies that, for any
$A \subset T$, the values of $Q_A$ are computable from the values of $Q_T$.
More specifically, the percentage of subjects who first select B, when presented
with the offered set A, is proportional to the percentage of subjects, presented
with the total set T, who first select any subset $B_i$ that includes in
addition to B only elements that do not belong to A.

To illustrate the proportionality condition, consider the choice
among entrees. Let $T = \{r,s,t\}$ and $A = \{r,t\}$, where r, s and t denote,
respectively, roast beef, steak and trout. According to proportionality,
therefore,

$\frac{Q_A(r)}{Q_A(t)} = \frac{Q_T(r) + Q_T(r,s)}{Q_T(t) + Q_T(t,s)}$.

Note that in the binary case, where $A = \{r,t\}$, $Q_A(r) = P(r,A) = P(r,t)$.
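
The entree example can be worked through with made-up numbers. The sketch below derives $Q_A$ from $Q_T$ by pooling every subset of T according to its intersection with A and then normalizing. Two things are assumptions rather than statements from the text: the proportions in `q_t` are hypothetical, and subsets whose intersection with A is all of A are left out of the normalization, on the reading that a first-stage selection must eliminate something.

```python
from fractions import Fraction


def restrict(q_t, offered):
    """Derive Q_A from Q_T by pooling subsets with the same trace on A."""
    a = frozenset(offered)
    pooled = {}
    for b_i, share in q_t.items():
        b = frozenset(b_i) & a
        # Skip subsets that miss A entirely or that leave A undivided
        # (an assumption: a first-stage selection must eliminate something).
        if not b or (b == a and len(a) > 1):
            continue
        pooled[b] = pooled.get(b, 0) + share
    total = sum(pooled.values())
    return {b: share / total for b, share in pooled.items()}


# Hypothetical first-selection proportions on T = {r, s, t}.
q_t = {
    ("r",): Fraction(20, 100), ("s",): Fraction(10, 100), ("t",): Fraction(30, 100),
    ("r", "s"): Fraction(25, 100), ("s", "t"): Fraction(5, 100),
    ("r", "t"): Fraction(10, 100),
}
q_a = restrict(q_t, {"r", "t"})
print(q_a[frozenset({"r"})], q_a[frozenset({"t"})])
```

Here those who first selected (r,s) from T are counted with those who selected r alone, which is the rationale given for the condition.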

The rationale behind the proportionality condition is the assumption
that, upon restricting the offered set from T to A, all individuals who
first selected $B \cup C$ from T, $C \subset T - A$, will now select B from A
since the alternatives of C are no longer available. For example, those who
first selected (r,s) from T will select roast beef when restricted to A because
now steak is not on the menu. The following theorem shows that the
(aggregate) process described above is compatible with EBA.


AGGREGATION THEOREM: A set of aggregate choice probabilities on T is
compatible with EBA iff there exist elimination probabilities on T that satisfy
Equations (9) and (10).

The proof of this theorem is readily reduced to earlier results, see the
Appendix in Tversky (1972a) and Theorem 2 in Tversky (1972b). It shows
that if (9) and (10) hold then

$P(x,A) = \frac{\sum Q(B_i) P(x, A \cap B_i)}{\sum Q(B_i)}$,

where $Q(B_i) = Q_T(B_i)$, and the summations range over all $B_i \subset T$
such that $B_i \cap A$ is nonempty. This form, in turn, is shown to be
equivalent to EBA. Hence, the Aggregation Theorem provides a new interpretation
of EBA as a model for group data.
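
The recursive form above can be evaluated directly once the proportions $Q(B_i)$ on T are given. The following sketch is one way to do so, under stated assumptions: the proportions in `q` are hypothetical, and terms with $A \cap B_i = A$ are dropped from both sums, which is algebraically equivalent to solving the displayed equation for $P(x,A)$ since such terms multiply $P(x,A)$ itself.

```python
from fractions import Fraction


def choice_probability(x, offered, q):
    """P(x,A) from first-selection proportions q defined on subsets of T."""
    a = frozenset(offered)
    if x not in a:
        return Fraction(0)
    if len(a) == 1:
        return Fraction(1)
    numerator = denominator = Fraction(0)
    for b_i, share in q.items():
        sub = frozenset(b_i) & a
        if not sub or sub == a:
            continue                      # no trace on A, or no elimination
        denominator += share
        if x in sub:
            numerator += share * choice_probability(x, sub, q)
    return numerator / denominator


# Hypothetical proportions on T = {r, s, t}; they sum to one.
q = {
    ("r",): Fraction(20, 100), ("s",): Fraction(10, 100), ("t",): Fraction(30, 100),
    ("r", "s"): Fraction(25, 100), ("s", "t"): Fraction(5, 100),
    ("r", "t"): Fraction(10, 100),
}
print(choice_probability("r", {"r", "t"}, q))
print([choice_probability(x, {"r", "s", "t"}, q) for x in "rst"])
```

The recursion ends because each selected subset is strictly smaller than the offered set.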

It is instructive to compare the above version of the EBA model to the
original version defined in Equation (1). First, note that the scale $Q(B)$
is not a measure of the overall value of the alternatives of B. Rather, it
reflects the degree to which they form a good cluster, as evinced by the
proportion of subjects who first selected B when presented with T. The
counterpart of $Q(B)$ in the original version of the EBA model is $u(B)$, the
measure of the aspects that belong to all alternatives of B, and do not belong
to any alter-

The individual version of the EBA model assumes that at any point in
time one has a fixed ordering of the relevant aspect-sets which, in turn,
induces a (lexicographic) ordering of the available alternatives. However,
at a different point in time, one may be in a different state of mind which
yields a different ordering of aspects and alternatives. Indeed, the
stochastic  component  was  introduced  into  the  model  to  accommodate  such  momentary 
fluctuations.  The  new  aggregate  version  of  EBA  assumes  that  each 
individual  has  a  fixed  ordering  of  the  relevant  aspect-sets,  and  the 
stochastic  component  of  the  model  is  associated  with  differences  between 
individuals  rather  than  with  changes  within  an  individual.  Hence,  the  former 
version  explains  choice  probabilities  in  terms  of  an  intra-individual  distribution 
of  states  of  mind,  whereas  the  latter  version  explains  the  data  in  terms  of  an 
inter-individual  distribution  of  tastes. 

The EBA model may provide a useful model of aggregate data because the same
principles that give rise to EBA as a model of individual choice appear to
apply to group data. As a case in point, let us reexamine the similarity
effect using the transportation problem discussed earlier. Suppose the
group is divided equally between the train $t_1$ and the plane $a_1$, and is
also equally divided between the two airlines $a_1$ and $a_2$. Hence,

$P(t_1,a_1) = P(a_1,a_2) = 1/2$.

We propose that the proportion of individuals who choose the train $t_1$ from
the offered set $\{t_1,a_1,a_2\}$ lies between 1/2 and 1/3 because the addition
of $a_2$ to $\{t_1,a_1\}$ is likely to affect those who chose $a_1$ more than
those who chose $t_1$. More generally, the addition of a new alternative or
product (e.g., a low-tar cigarette or a liberal candidate) hurts similar
alternatives (e.g., other low-tar cigarettes, and liberal candidates) more than
less similar alternatives.

Furthermore,  as  in  the  case  of  individual  choice,  the  similarity 
between  options  appears  to  enhance  the  discrimination  between  them. 

Suppose that each individual prefers train $t_2$ over train $t_1$ since it is
slightly faster. Suppose further that the group is equally divided
between $a_1$ and $t_1$, so that $P(a_1,t_1) = 1/2$. Contrary to CRM, which
implies $P(t_2,a_1) = 1$, we predict that $P(t_2,a_1)$ is likely to be between
1/2 and 1 because many of those who prefer $a_1$ over $t_1$ are not likely to
switch from a plane to a train because of the slight, albeit clear, advantage of
the faster train. Since the same correlational pattern emerges from both
individual and group data, the EBA model may be applicable to both, although
the assumptions and the parameters of the model have different interpretations
in the two cases.

Consider, for example, the assumption that the alternative set
$T = \{a_1,a_2,t_1\}$ in the transportation problem has a tree structure
$(a_1a_2)t_1$. In the individual version, the tree assumption implies that any
aspect that is shared by the train and any one of the airlines is also shared
by the other airline. In the aggregate case, the tree assumption entails that
both $Q_T(a_1,t_1)$ and $Q_T(a_2,t_1)$ vanish, that is, nobody eliminates from T
one airline only. Hence, if all individuals share the same tree structure but
not necessarily the same preferences, the aggregate data will generally exhibit
the same qualitative structure. The actual measure derived from aggregate data,
however, does not relate to the measures derived from individual data in any
simple manner.

In this section we apply the tree model to several sets of individual
and aggregate choice probabilities reported in the literature, construct
tree representations for these data and test Pretree against CRM. As was
demonstrated in the previous section, the trinary and the quarternary
conditions provide necessary and sufficient conditions for the representation
of binary choice probabilities as a preference tree. For error-free data,
therefore, these conditions can be readily applied to find a tree structure
that is compatible with the data. Since data are fallible, however, the
construction of the most appropriate tree structure, the estimation of
link-lengths and the evaluation of the adequacy of the tree model pose
non-trivial computational and statistical problems.

In the present paper, we do not develop a comprehensive solution to
the construction, estimation, and evaluation problems. Instead, we rely
on independent judgments (e.g., similarity data) for the construction of
the tree, and employ standard iterative maximization methods to estimate
its parameters. To evaluate goodness-of-fit we test the tree model,
assuming the hypothesized tree structure, against the binary version of
Luce's constant-ratio model.

It has been shown by Luce (1959) that the binary CRM, according to
which $P(x,y) = v(x)/(v(x)+v(y))$, is essentially equivalent to the following
product rule

(11)  $P(x,y)P(y,z)P(z,x) = P(x,z)P(z,y)P(y,x)$, i.e., $R(x,y)R(y,z)R(z,x) = 1$.

Thus, any two intransitive cycles through the same set of alternatives
are equiprobable. On the other hand, the trinary condition (4) yields

(12)  If $P(x,y) > 1/2$ and $(xy)z$ then $R(x,y)R(y,z)R(z,x) > 1$,
or $P(x,y)P(y,z)P(z,x) > P(x,z)P(z,y)P(y,x)$.

Any  hypothesized  tree  structure,  therefore,  can  be  examined  to  test 
whether  the  product  rule  is  violated  in  the  predicted  direction. 

The analysis of the data proceeds as follows. We start with a given
set of individual or collective pair comparison data along with a hypothesized
tree structure, derived from a priori considerations or inferred from
other data. Maximum likelihood estimates for both CRM and Pretree are obtained
using Chandler's (1969) iterative program (STEPIT), and the two models are
compared via a likelihood ratio test. In addition, we perform an estimate-free
comparison of the two models, by contrasting the product rule (11) and
the trinary inequality (12).
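
The estimate-free comparison amounts to computing the product $R(x,y)R(y,z)R(z,x)$ for every triple with hypothesized structure $(xy)z$ and summarizing how often, and by how much, it exceeds one. A minimal sketch of that tally follows; the function names and the toy probabilities are hypothetical and are not the data analyzed in this paper. Unlike the procedure in the text, ties with $R(x,y) = 1$ are not excluded here.

```python
from fractions import Fraction
from statistics import median


def cycle_ratio(p, x, y, z):
    """R(x,y) R(y,z) R(z,x), with R(a,b) = P(a,b)/P(b,a)."""
    def r(a, b):
        return p[(a, b)] / p[(b, a)]
    return r(x, y) * r(y, z) * r(z, x)


def trinary_summary(p, triples):
    """Share of triples with cycle ratio > 1, and the median ratio.

    `triples` lists (x, y, z) with hypothesized structure (xy)z; the adjacent
    pair is reordered so that R(x,y) >= 1 before the ratio is computed.
    """
    values = []
    for x, y, z in triples:
        if p[(x, y)] < p[(y, x)]:
            x, y = y, x
        values.append(cycle_ratio(p, x, y, z))
    share = sum(v > 1 for v in values) / len(values)
    return share, median(values)


# Toy binary probabilities generated by a tree (xy)z with links 3, 1, 2, 3.
p = {("x", "y"): Fraction(3, 4), ("x", "z"): Fraction(5, 8), ("y", "z"): Fraction(1, 2)}
p.update({(b, a): 1 - v for (a, b), v in list(p.items())})
print(trinary_summary(p, [("y", "x", "z")]))
```

Under the product rule (11) the ratio equals one for every triple, so a median well above one points in the direction predicted by (12).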

Choice  between  Celebrities 

Rumelhart and Greeno (1971) investigated the effects of similarity on
choice probability, and compared the choice models of Luce (1959) and Restle
(1961). The stimuli were 9 celebrities including three politicians (L. B.
Johnson, Harold Wilson, Charles DeGaulle), three athletes (Johnny Unitas, Carl
Yastrzemski, A. J. Foyt), and three movie stars (Brigitte Bardot, Elizabeth
Taylor, Sophia Loren). The subjects (N = 234) were presented with all 36 pairs
of names and were instructed to choose for each pair "the person with whom
they would rather spend an hour discussing a topic of their choosing".
On the basis of a $\chi^2$ test for goodness-of-fit, applied to the aggregate
choice probabilities, Rumelhart and Greeno (1971) were able to reject
Luce's model ($\chi^2(28) = 78.2$, $p < .001$) but not a particular version of
Restle's model ($\chi^2(19) = 21.9$, $p > .25$). Recall that Restle's model
coincides with the binary form of the EBA model.

The list of celebrities used in this study naturally suggests the following
tree structure with three branches corresponding to the three different
occupations represented in the list: (LBJ, HW, CDG) (JU, CY, AJF) (BB, ET, SL).
The estimates of the parameters of the tree, displayed in Figure 7, are
identical to those obtained by Edgell, Geisler and Zinnes (1973), who
corrected the procedure used by Rumelhart and Greeno (1971) and proposed a
simplification of the model which amounts to the above tree structure. The
tree model appears to fit the data quite well ($\chi^2(25) = 30.0$, $p > .20$),
although it has only three more parameters than Luce's model.


Insert  Figure  7  here 

Since Pretree includes CRM, the likelihood-ratio test can be used to
test and compare them. The test is based on the fact that if Model 1 is
valid and includes Model 2 then, under the standard assumptions,
$-2 \ln(L_2/L_1)$ has a $\chi^2$ distribution with $d_1 - d_2$ degrees of
freedom, where $L_1$ and $L_2$ denote the likelihood functions of Models 1 and
2, while $d_1$ and $d_2$ denote the respective numbers of parameters. If the
inclusive model is saturated, i.e., imposes no constraints, then the above test
is equivalent to the common $\chi^2$ test for goodness of fit. When the
likelihood-ratio test is applied to the present data, CRM is rejected in favor
of Pretree, $\chi^2(3) = 48.2$, $p < .001$. The average absolute deviation
between predicted and observed probabilities is .036 for CRM and .023 for
Pretree.
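
For readers who wish to reproduce the significance levels quoted here, the sketch below computes the likelihood-ratio statistic from two maximized log-likelihoods and evaluates the upper tail of the $\chi^2$ distribution for integer degrees of freedom, using only the standard library. It is an illustration, not the STEPIT procedure used in the paper; the function names are hypothetical.

```python
import math


def likelihood_ratio_statistic(loglik_restricted, loglik_full):
    """-2 ln(L_2 / L_1) from maximized log-likelihoods (Model 1 includes Model 2)."""
    return -2.0 * (loglik_restricted - loglik_full)


def chi2_upper_tail(x, df):
    """P(X >= x) for a chi-square variable with integer df >= 1.

    Starts from the closed forms for df = 1 (erfc) or df = 2 (exponential)
    and adds one term for each further pair of degrees of freedom.
    """
    half = x / 2.0
    if df % 2:
        tail, k = math.erfc(math.sqrt(half)), 1
    else:
        tail, k = math.exp(-half), 2
    while k < df:
        k += 2
        tail += half ** (k / 2 - 1) * math.exp(-half) / math.gamma(k / 2)
    return tail


print(chi2_upper_tail(48.2, 3))    # likelihood-ratio test of CRM against Pretree
print(chi2_upper_tail(30.0, 25))   # goodness of fit of Pretree
```

The first value falls far below .001 and the second lies above .20, in line with the levels reported in the text.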

It should be noted (see Falmagne, Reference Note 1, 1979) that the
test statistic for Pretree does not have an exact $\chi^2$ distribution because
the parameter space associated with the model is constrained not only by
the equations implied by the quarternary conditions, but also by the
trinary inequality. The result, however, is a stricter test of Pretree
since the inequalities imposed on the solution can only reduce goodness of
fit.

Since the product rule (11) and the trinary inequality (12) are the
key binary properties that give rise, respectively, to CRM and Pretree, it
is instructive to compare them directly. Using the tree structure presented
in Figure 7, the trinary inequality applies in 9 x 6 = 54 triples and it is
satisfied in 39% of the cases. Because the various triples are not independent,
no simple statistical test is readily available. To obtain some
indication about the size of the effect, we computed the value of
$R(xyz) = R(x,y)R(y,z)R(z,x)$ for all triples satisfying $(xy)z$ and
$R(x,y) > 1$. The median of these values equals 1.40, and the interquartile
range is (1.13, 1.68). Recall that under CRM the trinary inequality is expected
to hold in 50% of the cases, and the median R(xyz) should equal one. The summary
statistics for all the studies in this section are presented in Table 1.

Political  Choice 

The next three data sets were obtained from Lennart Sjöberg, who collected
both similarity and preference data for several sets of stimuli, and showed
a positive correlation between interstimulus distances (derived from
multidimensional scaling) and the standard deviation of utility differences
(derived from a Thurstonian model). Sjöberg (1977) and Sjöberg and Capozza (1975)
conducted two parallel studies of preferences for Swedish and Italian
political parties. In these experiments, 215 Swedish students and 195
Italian students were presented with all pairs of the seven leading Swedish
and Italian parties, respectively. The subjects first rated the similarity
between all 21 pairs of parties on a scale from 1 to 9, and then indicated
for each pair which party they preferred. In addition, the subjects were
presented with all 35 triples of parties and asked to choose one party from
each triple.

The average similarities between the parties were first used to
construct an additive similarity tree according to the ADDTREE method
developed by Sattath and Tversky (1977). In this construction, which
generalizes the familiar hierarchical clustering scheme, the stimuli are
represented as terminal nodes in a tree so that the dissimilarity between
stimuli corresponds to the length of the path that joins them. For illustration,
we present in Figure 8 the additive tree (ADDTREE) solution for the
similarities between the Swedish parties. The product-moment correlation
between rated similarities and path-length is -.96. Assuming the tree structure
derived from ADDTREE, Chandler's (1969) STEPIT program was employed to
search for maximum likelihood estimates of the parameters of Pretree, using
the observed choice probabilities. The obtained preference tree for the
Swedish data is presented in Figure 9, and the preference tree for the
Italian data is presented in Figure 10.


Insert Figures 8, 9, 10 here


Several comments about the relations between similarity and preference
trees are in order. First, the rules for computing dissimilarity and preference
from a given tree are quite different. The dissimilarity between x and y
is represented by the length of the path (i.e., the sum of the links)
that connects x and y, while the degree of preference R(x,y) is represented
by the ratio of the respective paths. Second, the numerical
estimates  of  the  links  in  the  two  representations  tend  to  differ 
systematically.  In  general,  the  distances  between  the  root  and  the 
terminal  nodes  vary  much  more  in  a  preference  tree  (due  to  the  presence 
of  extreme  choice  probabilities)  than  in  a  similarity  tree.  Furthermore,  some 
links  that  appear  in  the  similarity  tree  sometimes  vanish  in  the  estimation 
of  Pretree  (as  can  be  seen  by  comparing  Figures  8  and  9)  indicating  the 
presence  of  aspects  that  affect  judged  similarity,  but  not  choice  probability. 
Third,  the  root  in  a  similarity  tree  is  essentially  arbitrary  since  the 
distance  between  nodes  is  unaffected  by  the  choice  of  root.  The  probability 
of  choice  in  Pretree,  however,  is  highly  sensitive  to  the  choice  of  a  root. 
Consequently,  several  alternative  roots  were  tried  and  the  best-fitting 
structure  was  selected  in  each  case. 

Tests of goodness of fit indicate that Pretree provides an excellent
account of the Swedish data, χ²(11) = 5.8, p > .5, with an average absolute
deviation of .012, compared with χ²(15) = 49.1, p < .001, with an average
absolute deviation of .038 for CRM. Pretree also provides a reasonable
account of the Italian data, χ²(11) = 19.5, p > .05, with an average absolute
deviation of .023, compared with χ²(15) = 67.6, p < .001, with an average
absolute deviation of .042 for CRM. The applications of the likelihood ratio
test indicate that Pretree fits these data significantly better than CRM;
the test statistics are χ²(4) = 43.3, p < .001, for the Swedish data and
χ²(4) = 48.1, p < .001, for the Italian data. Furthermore, for the Swedish
data, the trinary inequality is satisfied in 96% of the cases (N = 23), the
median R(xyz) equals 1.73, and the interquartile range is (1.38, 2.27).

For the Italian data, the trinary inequality is satisfied in 78% of the
cases (N = 18), the median R(xyz) equals 1.74, and the interquartile range
is (.93, 2.78).
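
The likelihood ratio statistics quoted above can be checked by hand. The
sketch below assumes, as the reported figures suggest, that each goodness-of-fit
value is a likelihood-ratio chi-square and that CRM is nested within Pretree,
so that the test statistic is the difference between the two values, with the
difference in degrees of freedom. The function name is illustrative only.

```python
def likelihood_ratio(chi2_restricted, df_restricted, chi2_general, df_general):
    """Statistic and degrees of freedom for comparing two nested models.

    Assumes both inputs are likelihood-ratio chi-squares computed on the
    same data, with the restricted model (CRM) nested in the general one.
    """
    stat = round(chi2_restricted - chi2_general, 1)
    return stat, df_restricted - df_general


# Swedish data: CRM chi-square(15) = 49.1 versus Pretree chi-square(11) = 5.8
swedish = likelihood_ratio(49.1, 15, 5.8, 11)
```

The result for the Swedish data reproduces the reported χ²(4) = 43.3.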

The availability of both binary and trinary probabilities in the political
studies permitted an additional test of Pretree. Recall from (7) that the
tree model implies

$$\frac{P(x,z)}{P(z,x)} > \frac{P(x,xyz)}{P(z,xyz)}, \quad \text{provided } (xy)z,$$

while CRM implies that the two ratios are equal. For the Swedish data, the
above inequality is satisfied in 87% of the cases (N = 46), the median
P(x,z)P(z,xyz)/P(z,x)P(x,xyz) equals 1.28, and the interquartile range is
(1.12, 1.64). For the Italian data, the inequality is satisfied in 81% of the
cases (N = 36), the median of the above product ratio equals 1.19, and the
interquartile range is (.86, 2.28). Note that under CRM

$$\frac{P(x,z)\,P(z,xyz)}{P(z,x)\,P(x,xyz)} = \frac{u(x)\,u(z)}{u(z)\,u(x)} = 1.$$
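
The comparison of binary and trinary odds can be sketched in a few lines of
Python. The function name and the probabilities below are hypothetical
illustrations, not values from the political studies; the sketch only shows
that the product ratio equals 1 under CRM and exceeds 1 when the inequality
implied by the tree (xy)z holds.

```python
def odds_product_ratio(p_xz, p_x_xyz, p_z_xyz):
    """Return P(x,z)P(z,xyz) / (P(z,x)P(x,xyz)) for a triple with tree (xy)z.

    p_xz    : binary probability of choosing x over z
    p_x_xyz : probability of choosing x from {x, y, z}
    p_z_xyz : probability of choosing z from {x, y, z}
    """
    p_zx = 1.0 - p_xz  # binary choice probabilities are complementary
    return (p_xz * p_z_xyz) / (p_zx * p_x_xyz)


# Hypothetical CRM scale values u(x), u(y), u(z): the ratio reduces to 1.
u = {"x": 2.0, "y": 3.0, "z": 5.0}
total = sum(u.values())
crm_ratio = odds_product_ratio(u["x"] / (u["x"] + u["z"]), u["x"] / total, u["z"] / total)

# Hypothetical proportions in which adding y hurts x more than it hurts z.
tree_ratio = odds_product_ratio(p_xz=0.60, p_x_xyz=0.30, p_z_xyz=0.35)
```

A ratio above 1 counts as a case that satisfies the inequality; the median and
quartiles reported above summarize such ratios over all tested triples.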

Choice  between  Academic  Disciplines 

In a third study conducted by Sjöberg (1977), the alternatives consisted
of the following twelve academic disciplines that comprise the social
science program at the University of Göteborg: Psychology, Education,
Sociology, Anthropology, Geography, Political Science, Law, Economic History,
Economics, Business Administration, Statistics, Computer Science. A group
of 85 students from that university first rated the similarity between
all pairs of disciplines on a 9-point scale, and then indicated for each
of the 66 pairs the discipline they preferred.

As in the two preceding analyses, the tree structure was obtained via
ADDTREE, and STEPIT was employed to search for maximum likelihood estimates
of  the  parameters.  The  resulting  preference  tree  for  the  choice  between 
the  twelve  social  sciences  is  presented  in  Figure  11. 


Insert  Figure  11  here 

A χ² test for goodness of fit yields χ²(50) = 45.5, p > .25 for Pretree,
compared with χ²(55) = 69.1, p > .05, for CRM, and the likelihood ratio
test rejects CRM in favor of Pretree, χ²(5) = 23.6, p < .001. The average
absolute deviation between predicted and observed probabilities is .025
for Pretree and .035 for CRM. Finally, the trinary inequality is satisfied
in 34% of the cases (N = 36), the median R(xyz) equals 1.52, and the
interquartile range is (1.21, 1.86).

Choice  Between  Shades  of  Gray 

In a classic study of unfolding theory, Coombs (1953) used as stimuli
12 patches of gray that vary in brightness. The subjects were presented
with all possible sets of 4 stimuli, and were asked to rank them from the
most to the least representative gray. Binary choice probabilities were
estimated for each subject by the proportion of rank orders in which one
stimulus was ranked above the other. The data provided strong support for
Coombs' probabilistic unfolding model in which the stimuli are represented
as random variables, and the derived choice probabilities reflect momentary
fluctuations in one's perceptions of the stimuli as well as in one's notion
of the ideal gray.


Insert Figure 12 here


To  represent  Coombs'  data  as  a  tree,  consider  a  line  representing 
variation  in  brightness  (with  white  and  black  at  the  two  endpoints)  that  is 
folded  in  the  middle  at  a  point  corresponding  to  the  prototypical  gray.  The 
stimuli  can  now  be  represented  as  small  branches  stemming  from  this  folded  line, 
see Figure 12. Because of the large number of zeros and ones in these data, we
did not attempt to estimate the tree. Instead, we inferred the characteristic
folding point of each subject from the data and used the induced tree structure
to compare, separately for each subject, the trinary inequality against the
product rule, letting P(x,y) denote the probability that x is judged to be farther
than y from the prototypical gray. Triples involving zero probability were
excluded from the analysis. The results for each one of the four subjects,
presented in the bottom part of Table 1, show that the product rule (11)
is violated in the manner implied by the trinary inequality (12).


Insert  Table  1  here 


Table 1 summarizes the analyses of the studies discussed in this section.
The left-hand part of the table describes the statistics for the trinary
inequality, where N is the number of tested triples, % is the percentage
of triples that confirm the trinary inequality, R is the median value of
R(xyz) = R(x,y)R(y,z)R(z,x), while R1 and R3 are the first and third quartiles
of the distribution of R(xyz). The right-hand part of Table 1 describes
the measures of goodness of fit for both CRM and Pretree, where d is the
average absolute deviation between observed and predicted choice probabilities.
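
The left-hand statistics of Table 1 can be computed as in the following
sketch. It assumes that R(a,b) denotes the odds P(a,b)/P(b,a), as the product
form of R(xyz) suggests, and that each triple is ordered according to its
assumed tree (xy)z. The function names and the example triples are
hypothetical; they are not data from any of the studies.

```python
from statistics import median, quantiles


def trinary_ratio(p_xy, p_yz, p_zx):
    """R(xyz) = R(x,y) R(y,z) R(z,x), taking R(a,b) = P(a,b) / P(b,a)."""
    def odds(p):
        return p / (1.0 - p)
    return odds(p_xy) * odds(p_yz) * odds(p_zx)


def summarize(triples):
    """N, percentage of ratios above 1, median, and quartiles of R(xyz)."""
    ratios = [trinary_ratio(*t) for t in triples]
    percent = 100.0 * sum(r > 1.0 for r in ratios) / len(ratios)
    q1, _, q3 = quantiles(ratios, n=4)
    return {"N": len(ratios), "percent": percent,
            "median": median(ratios), "Q1": q1, "Q3": q3}


# Hypothetical binary probabilities (P(x,y), P(y,z), P(z,x)) for four triples.
example_triples = [(0.60, 0.70, 0.40), (0.55, 0.60, 0.50),
                   (0.70, 0.50, 0.35), (0.60, 0.40, 0.45)]
summary = summarize(example_triples)
```

Under the product rule every ratio equals 1, so the share of ratios above 1
serves as an estimate-free indication of the direction in which CRM fails.
Triples with a probability of 0 or 1 must be excluded beforehand, since the
odds are then undefined.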

Tree Representation of Choice Data

The  examination  of  the  trinary  inequality  provides  an  estimate- 
free  comparison  of  CRM  and  Pretree.  The  results  described  in  Table  1  show 
that,  in  all  data  sets,  CRM  is  violated  in  the  direction  implied  by  the 
similarity  hypothesis  and  the  assumed  tree  structure.  The  statistical 
tests  for  the  correspondence  between  models  and  data  indicate  that 
Pretree  offers  an  adequate  account  of  the  data  that  is  significantly  better 
than the account offered by CRM. Apparently, the introduction of a few
additional parameters that correspond to aspects shared by some of the
alternatives results in a substantial improvement in goodness of fit.
Furthermore,  Pretree  yields  interpretable  hierarchical  representations  of 
the  alternatives  under  study  along  with  the  measures  of  the  relevant 
aspect  sets. 

The preceding analyses relied on similarity data or on a priori
considerations to construct the tree structure, and used choice probabilities to
test the model and to estimate the tree. This procedure avoids the
difficulty involved in using the same data for constructing the tree and for
testing its validity. It is also attractive because similarity data are
easily obtained, and because they are typically more stable and less variable
than preferences. An examination of Sjöberg's data, for example, shows that
subjects who reveal markedly different preferences tend, nevertheless, to
exhibit considerable agreement in judgments of similarity. The only
drawback of this procedure is that it fails to produce the best tree
whenever the similarities and the preferences follow different structures. The
development of an effective algorithm for constructing a tree from fallible
preferences and the development of appropriate estimation and testing
procedures remain open problems for future research.

The correspondence between the observed and the predicted choice
probabilities indicates that the tree structures inferred from judgments of
similarity generally agree with the structures implied by the observed
choice probabilities. This result supports the notion of correspondence
between similarity and preference structures, originated by Coombs (1964),
and underscores the potential use of similarity scaling techniques in the
analysis of choice behavior. Other analyses of the relations between the
representations of similarity and of preference, based on multidimensional
scaling, are reported in Carroll (1972), Nygren and Jones (1977),
Sjöberg (1977) and Stefflre (1972).

CONSTRAINED CHOICE AND THE EFFECT OF AGENDA

The preceding development, like other models of choice, deals
with the selection of a single element from some offered set. The
present section investigates choice that is constrained by a partition
imposed on the offered set. For example, the choice of an alternative
from the set {x,y,v,w} can be constrained by the requirement to choose
first between {x,y} and {v,w} and then to choose a single element from the
selected pair. Constraints of this type are quite common: they could be
imposed by others, induced by circumstances, or adopted for convenience.

For example, the decision regarding a new appointment is sometimes
introduced as an initial decision between a senior or a junior appointment,
followed by a later choice among the respective junior or senior candidates.
Deadlines and other time limits provide another source of constraint.
Suppose the alternatives of A ⊂ T, for example, are no longer available
after April 1st. Prior to this date, therefore, one has to decide whether
to choose an element of A, or to select an element from T - A, in which case
the choice of a particular element can be delayed. The selection of an
agenda and the grouping of options for voting (which have long been
recognized as influential procedures) are familiar examples of external
constraints.

There are many situations, however, in which a person constrains
his choice to reduce cost or effort. Consider, for example, a consumer who
intends to purchase one item from a set {x,y,v,w} of four competing products.
Suppose there are two stores in town that are quite distant from each other;
one store carries only x and y, while the other carries only v and w. Under
such circumstances, the consumer is likely to select first a store and then
a product, because he has to decide which store to enter but he does not have
to choose a product before entering the store. Similarly, people typically
select a restaurant first and an entree later, even when they are
thoroughly familiar with the available menus. Thus, the need to make some
decisions (e.g., of a restaurant) at an early stage and the common tendency
to delay decisions (e.g., of an entree) to a later stage constrain the
sequence of choices leading to the selected alternative.

The  effect  of  an  agenda  on  group  decision  making  has  been  investigated 
by  an  economist,  Charles  R.  Plott,  and  a  lawyer,  Michael  E.  Levine,  from 
Caltech.  Levine  and  Plott  (1977)  conducted  an  ingenious  study  of  a  flying 
club,  to  which  they  belong,  whose  members  had  to  decide  on  the  size  and 
composition  of  the  club's  aircraft  fleet.  There  were  a  few  hundred  competing 
alternatives, and the group was to meet once and decide by a majority vote.
Levine and Plott constructed an agenda designed to maximize the chances
of selecting the alternative they preferred. The group followed this agenda,
and, indeed, chose the option favored by the authors. A second study
demonstrated the impact of agenda under controlled laboratory conditions.
Plott and Levine (1978) developed a model for individual voting behavior
and used it to construct for each alternative an agenda for the group,
designed to enhance the selection of that alternative. The results indicate
that, although the specific model was not fully supported, the imposed
agenda had a substantial effect on group choice.

A Theoretical Analysis

An  agenda  or  a  constraint  imposed  on  an  offered  set  imposes  a  hierarchical 
structure  or  a  tree  on  that  set.  Suppose,  for  example,  that  {B,C,D}  is 
a partition of A; hence, under the constraint [[B][C]][D] the choice of an
alternative from A proceeds by first choosing between D and B∪C and then
choosing between B and C, if D is eliminated in the first stage. It is
essential to distinguish here between the intrinsic tree structure (defined
in terms of the relations among the aspects that characterize the alternatives)
and the imposed structure that characterizes the external constraints. The
choice among {x,y,v,w}, for example, whose aspects form the tree (xy)(vw)
may be constrained by the requirement to choose first between {x,w} and
{y,v}. To avoid confusion we use parentheses, e.g., (xy)z, to characterize
the intrinsic tree, and brackets, e.g., [xy]z, to denote the imposed
constraints. 

Let P(x,[A][B]), x∈A, A∩B = ∅, denote the probability of selecting
x from A∪B subject to the constraint of choosing first between A and B.
The present treatment is based on the following assumption.

$$P(x,[A][B]) = P(x,A)\,P(A,A\cup B) = P(x,A)\sum_{y\in A} P(y,A\cup B). \tag{13}$$

That is, the probability of choosing x under [A][B] is decomposable into two
independent choices: the choice of x from A, and the choice of A from
[A][B]. Furthermore, the latter choice is reduced to the selection of any
element of A from the offered set A∪B. Hence for A = {x,y} and B = {v,w},
P(x,[xy][vw]) = P(x,y)(P(x,xyvw) + P(y,xyvw)). Equation (13) does not
assume any choice model; it merely expresses the probability of a constrained
choice in terms of the probabilities of non-constrained choices.

A choice model is called invariant if the probability of choice is
unaffected by constraints imposed on offered sets. Thus, invariance
implies that P(x,[A][B]) = P(x,A∪B) for all x∈A∪B. It is easy to see
that CRM is invariant. In fact, the invariance condition is equivalent to
Luce's (1959) choice axiom, which asserts that P(x,A) = P(x,B)P(B,A)
whenever B⊂A and P(x,A) > 0. Consequently, Luce's model is the only
invariant theory of choice; all other models violate invariance in one
form or another.
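
Assumption (13) and the invariance of CRM can be illustrated with a short
sketch. The function `constrained_prob` and the scale values are hypothetical;
`choice_prob(y, S)` stands for any unconstrained choice probability P(y,S)
that one wishes to plug in.

```python
def constrained_prob(x, A, B, choice_prob):
    """P(x, [A][B]) according to Equation (13).

    choice_prob(y, S) must return the unconstrained probability P(y, S).
    """
    A, B = frozenset(A), frozenset(B)
    if x not in A or A & B:
        raise ValueError("x must belong to A, and A and B must be disjoint")
    union = A | B
    p_A = sum(choice_prob(y, union) for y in A)  # P(A, A ∪ B)
    return choice_prob(x, A) * p_A


# Illustration with CRM and hypothetical scale values u.
u = {"x": 1.0, "y": 2.0, "v": 3.0, "w": 4.0}


def crm(y, S):
    return u[y] / sum(u[s] for s in S)


p_constrained = constrained_prob("x", {"x", "y"}, {"v", "w"}, crm)
p_free = crm("x", {"x", "y", "v", "w"})
```

With these values both probabilities equal .1, as invariance requires.
Replacing `crm` by the choice probabilities of a tree model generally yields
different values for the constrained and the free choice.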

Two  hierarchical  structures  or  trees  defined  on  the  same  set  of 
alternatives  are  called  compatible  iff  there  exists  a  third  tree,  defined 
on  the  same  alternatives, which  is  a  refinement  of  both.  Refinement  is 
used here in a non-strict sense so that every tree is a refinement of
itself. Thus, ((xy)z)(uvw) is compatible with (xyz)((uv)w) because both
are coarsenings of ((xy)z)((uv)w). On the other hand, (xy)z and (xz)y are
incompatible since there is no tree that is a refinement of both. Note
that  the  (degenerate)  tree  structure  implied  by  CRM  is  compatible  with  any 
tree.  The  relation  between  the  intrinsic  preference  tree  and  the  imposed 
agenda  is  described  in  the  following  theorem. 
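
The definition of compatibility can be checked mechanically. The sketch below
is an assumption-laden illustration rather than a procedure taken from the
text: trees are written as nested tuples of leaf labels, and two trees are
declared compatible when the clusters (leaf sets) of both, pooled together,
are pairwise nested or disjoint, which is the condition under which a single
tree can contain all of them. All function names are hypothetical.

```python
def leaves(tree):
    """Set of leaf labels below a node of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(sub) for sub in tree))


def clusters(tree):
    """All leaf sets spanned by the nodes of the tree."""
    if isinstance(tree, str):
        return {frozenset([tree])}
    found = {leaves(tree)}
    for sub in tree:
        found |= clusters(sub)
    return found


def compatible(t1, t2):
    """True if the pooled clusters are pairwise nested or disjoint."""
    pooled = clusters(t1) | clusters(t2)
    return all(a <= b or b <= a or not (a & b) for a in pooled for b in pooled)


# The examples discussed in the text.
first = ((("x", "y"), "z"), ("u", "v", "w"))    # ((xy)z)(uvw)
second = (("x", "y", "z"), (("u", "v"), "w"))   # (xyz)((uv)w)
same_refinement = compatible(first, second)
crossing = compatible((("x", "y"), "z"), (("x", "z"), "y"))
```

The degenerate tree ("x", "y", "z"), which corresponds to CRM, contributes
only the full set and the singletons, so it is compatible with every tree on
the same alternatives.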

COMPATIBILITY THEOREM: If (13) holds and Pretree is valid then a set of
choice probabilities is unaffected by constraints iff the constraints are
compatible with the structure of the tree.

A proof of the theorem is given in Section V of the Appendix; the
following discussion explores the simplest example of the effect of agenda.
Suppose T = {x,y,z}, Pretree holds and the intrinsic tree is (xy)z.
Let α, β and γ denote the measures of the unique aspects of x, y and z,
respectively, and let θ denote the measure of the aspects shared by x and
y, see Figure 5. Setting α + β + γ + θ = 1 yields

$$P(x,xyz) = \alpha + \frac{\theta\alpha}{\alpha+\beta}, \qquad P(y,xyz) = \beta + \frac{\theta\beta}{\alpha+\beta}, \qquad P(z,xyz) = \gamma.$$

There are three non-trivial constraints in this case. The first,
[xy]z, coincides with the tree structure, hence it does not influence choice
probability. The other two partitions, [xz]y and [yz]x, are symmetric with
respect to x and y, hence we investigate only the former. By (13), we have
P(y,[xz]y) = P(y,xyz). More generally, an imposed partition, e.g., [xz]y,
does not change the probability of selecting the isolated alternative, e.g., y.
The imposed constraint, however, can have a substantial effect on the
probability of selecting other alternatives, e.g., x and z. Since

$$P(x,[xz]y) = P(x,z)\,\bigl(P(x,xyz) + P(z,xyz)\bigr),$$

P(x,[xz]y) > P(x,xyz) iff
P(z,xyz)P(x,z) > P(x,xyz)P(z,x).

In the tree model, with (xy)z, this inequality is always satisfied, see Equation
(7), because

$$\frac{P(x,z)}{P(z,x)} = \frac{\alpha+\theta}{\gamma} > \frac{\alpha + \theta\alpha/(\alpha+\beta)}{\gamma} = \frac{P(x,xyz)}{P(z,xyz)},$$

hence, P(x,[xz]y) > P(x,xyz). Imposing the partition [xz]y, therefore, on
the tree (xy)z is beneficial to x, immaterial for y, and harmful to z.

To interpret this result, recall that x and y share more aspects with
each other than with z. In the absence of external constraints, z benefits
directly from the competition between x and y, as demonstrated by the
above inequality which shows that x loses proportionally more than z by the
addition of y to the set {x,z}. The constraint [xz]y reduces, in effect, the
direct competition between x and y, and enhances x at the expense of z.

A numerical example illustrates this effect. Suppose α = .0001,
β = .0999, θ = .4 and γ = .5. In a free choice, therefore, P(z,xyz) = .5,
P(y,xyz) = .4995 and P(x,xyz) = .0005 because x is practically dominated
by y. Under the constraint [xz]y, however, the probabilities of choosing
z, y and x, respectively, are .2761, .4995 and .2244. Thus, the imposed
partition increases the probability of choosing x from .0005 to .2244.
This occurs because x fares well against z, but performs badly against
y. In a regular choice where x is compared directly to y, its chances are
negligible. Under the partition [xz]y, however, these chances improve
greatly because there is an even chance to eliminate y in the first stage,
and a close-to-even chance to eliminate z in the second stage.
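
The numerical example can be retraced with the formulas given above. The
sketch assumes the binary probability P(x,z) = (α+θ)/(α+θ+γ), which follows
from the odds (α+θ)/γ used in the preceding derivation; the variable names are
illustrative. Evaluated this way, the constrained probabilities of x and z
come out at roughly .2225 and .2780, close to but not identical with the
figures quoted in the text, which may reflect rounding in the original
computation.

```python
alpha, beta, theta, gamma = 0.0001, 0.0999, 0.4, 0.5

# Free choice from {x, y, z} under the intrinsic tree (xy)z.
free = {
    "x": alpha + theta * alpha / (alpha + beta),
    "y": beta + theta * beta / (alpha + beta),
    "z": gamma,
}

# Binary choice between x and z: theta counts for x because z lacks it.
p_xz = (alpha + theta) / (alpha + theta + gamma)

# Constraint [xz]y evaluated by Equation (13).
pair = free["x"] + free["z"]          # probability of entering the pair {x, z}
constrained = {
    "x": p_xz * pair,
    "y": free["y"],                   # the isolated alternative is unaffected
    "z": (1.0 - p_xz) * pair,
}
```

The qualitative conclusion does not depend on the last digits: the constraint
raises the probability of x by more than two orders of magnitude, leaves y
unchanged, and lowers z.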

The above treatment of constrained choice should be viewed as a
first approximation because its assumptions probably do not always hold.
First, the alternatives in question may not form a tree. Second, the
independence condition, embodied in (13), may fail in many situations.
Finally, the probability of selecting A over B may not equal
$\sum_{x\in A} P(x,A\cup B)$, particularly when A and B have a different
number of elements that could induce a bias to choose the larger or the
smaller set. Nevertheless, the proposed model appears to provide a promising
method for the analysis of constrained choice.

Constrained Choices among Prospects and Applicants

The present experiment investigates the effect of agenda on individual
choice, and tests the implications of the preceding analysis. Two
parallel studies are reported using hypothetical prospects (Study I) and
college applicants (Study II) as choice alternatives. Each prospect was
described as p% chance to win $a and (100 - p)% chance to win nothing,
denoted ($a,p%). Each applicant was characterized by a high school grade
point average (GPA) and an average score on the Scholastic Achievement
Test (SAT). The subjects were reminded that the SAT has a maximum of 800 with
an average of about 500, and that GPA is computed by letting A = 4,
B = 3, etc.

One  hundred  students  from  Stanford  University  participated  in  each 
of  the  two  studies.  Every  subject  was  presented  individually  with  10  triples 
of  alternatives,  each  displayed  on  a  separate  card.  Each  triple  was  divided 
into  a  pair  of  alternatives  and  an  odd  alternative,  and  the  subject  was 
instructed to decide first whether he or she preferred the odd alternative
or one of the members of the pair. If the odd alternative was selected,
the elements of the triple were not considered again. If the pair was
selected, the subject was given an opportunity to choose between its
members after the presentation of all ten triples. The delay was designed
to reduce the dependence between the trinary and the binary choices.

The subjects in Study I were asked to imagine that they were actually
faced with the choice between the displayed prospects, and to indicate
the decision they would have made in each case. The subjects in Study II
were asked to select, from each triple, the applicant that they preferred.
Subjects  were  reminded  that  their  task  was  to  express  their  preferences 
rather  than  predict  which  applicant  was  most  likely  to  be  admitted  to  college. 
The  participants  in  both  studies  were  asked  to  consider  each  choice  carefully 
and  to  treat  each  triple  as  a  separate  choice  problem. 

The  alternatives  in  each  triple,  denoted  x,y,z,  were  constructed  so  that 
(i)  x  and  y  are  very  similar,  (ii)  z  is  not  very  similar  to  either  x  or  y, 
(iii)  the  advantage  of  y  over  x  on  one  dimension  appears  greater  than  the 
advantage  of  x  over  y  on  the  other  dimension,  so  that  y  is  preferable  to  x. 

In Study I, z is a sure prospect while x and y are risky prospects with
similar probabilities and outcomes, and with y superior to x in expected value.
For example, x = ($40, 75%), y = ($50, 70%) and z is $25 for sure, denoted
($25). In Study II, x and y are applicants with relatively high GPA and
moderate SAT, while z is an applicant with a relatively low GPA and fairly
high SAT. For example, x = (3.5, 562), y = (3.4, 596) and z = (2.5, 725).
The results of a pilot study indicated that one-tenth of a point on the GPA
scale is roughly equivalent to twenty SAT points. According to this criterion
for overall quality, applicant y is 'better' than x in all cases. All triples
of  prospects  and  applicants  are  displayed  in  Table  2. 

The  present  experiment  was  designed  to  compare  choice  under  [xy]z  with 
choice  under  [xz]y.  Hence,  for  each  triple,  one-half  of  the  subjects  had 
to  choose  first  between  the  pair  (x,y)  and  z,  while  the  remaining  one-half 
had  to  choose  first  between  the  pair  (x,z)  and  y.  Each  subject  made  five 
choices under [xy]z and five choices under [xz]y. The order of triples and
constraints, as well as the positions of the option cards (i.e., left, center,
right) were all counterbalanced.

Because  alternatives  x  and  y  have  much  more  in  common  with  each  other 
than  with  z,  the  tree  structure  that  best  approximates  the  triples  is  (xy)z. 
Hence,  the  constraint  [xy]z  is  compatible  with  the  natural  structure  of  the 
alternatives,  while  the  constraint  [xz]y  is  not.  The  preceding  analysis 
implies that the latter should enhance the choice of x, hinder the choice of
z, and have no substantial effect on the choice of y. Stated formally,

d(x) = P(x,[xz]y) - P(x,[xy]z) > 0
d(y) = P(y,[xz]y) - P(y,[xy]z) = 0
d(z) = P(z,[xz]y) - P(z,[xy]z) < 0

Obviously, in the absence of any effect due to the imposed constraints d(x) =
d(y) = d(z) = 0. The proportions of subjects that chose x and y in each
triple  under  the  two  constraints  are  presented  in  Table  2,  along  with  the 
values  of  d(x),  d(y)  and  d(z)  defined  above. 


Insert  Table  2  here 

The results reported in Table 2 tend to confirm the predicted
pattern of choices. In both studies the values of d(x) are all positive
while the values of d(z) are negative with a few small exceptions.
Furthermore, in both Studies I and II the means of d(x) are significantly
positive, yielding t(9) = 9.2 and t(9) = 8.6, respectively, p < .001, while
the means of d(z) are significantly negative, yielding t(9) = -3.0, p < .05,
in Study I, and t(9) = -5.5, p < .001 in Study II. The means of d(y) were
also negative, yielding t(9) = -2.3 and t(9) = -2.8, respectively, .01 < p < .05.
Hence, the shift from the natural constraint [xy]z to the constraint [xz]y
increases the chances of x and decreases the chances of z and, to a
lesser extent, of y. The latter effect, which departs from the predicted
pattern, may reflect a response bias against the odd alternative.

The  pattern  of  results  described  in  Table  2  seems  to  exclude  two 
alternative  simple  models  that  produce  an  agenda  effect.  Suppose  choices 
are  made  at  random  so  that  one  chooses  between  the  odd  and  the  paired 
alternatives  with  equal  probability.  As  a  consequence, 

d(x) = P(x,[xz]y) - P(x,[xy]z) = 1/2 × 1/2 - 1/2 × 1/2 = 0,
d(y) = P(y,[xz]y) - P(y,[xy]z) = 1/2 - 1/2 × 1/2 = 1/4 > 0, and
d(z) = P(z,[xz]y) - P(z,[xy]z) = 1/2 × 1/2 - 1/2 = -1/4 < 0,
which are incompatible with the experimental findings.
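
The predictions of the random choice model can be verified with exact
fractions. The helper `random_choice` is a hypothetical name; it assumes a
fair coin at each of the two stages, as described above.

```python
from fractions import Fraction

half = Fraction(1, 2)


def random_choice(pair, odd):
    """Choice probabilities when both stages are decided by a fair coin."""
    probs = {odd: half}              # the odd alternative wins the first stage
    for member in pair:
        probs[member] = half * half  # the pair is chosen, then one member
    return probs


under_xz_y = random_choice(("x", "z"), "y")
under_xy_z = random_choice(("x", "y"), "z")
d = {a: under_xz_y[a] - under_xy_z[a] for a in "xyz"}
```

The resulting differences are 0 for x, 1/4 for y and -1/4 for z, so the model
predicts a gain for y and no change for x, the opposite of the observed
pattern.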

The  random  choice  model  gives  a  distinct  advantage  to  the  odd  alternative, 
hence  its  failure  suggests  a  different  model  according  to  which  the  odd 
alternative  suffers  a  setback,  perhaps  because  people  prefer  to  delay  the 
choice and avoid commitment. This hypothesis, however, implies
d(x) = 0, d(y) < 0, and d(z) > 0, again contrary to the data.

Since all triples have the same structure, it is possible to pool
all x-choices, y-choices and z-choices across triples and test our
hypotheses within the data of each subject. Let P_i(x,[xz]y) denote the
proportion of triples in which subject i made an x-choice under the constraint
[xz]y, etc. Let d_i(x) = P_i(x,[xz]y) - P_i(x,[xy]z), d_i(z) = P_i(z,[xz]y) -
P_i(z,[xy]z), and let D_i = d_i(x) - d_i(z). Thus, D_i measures the advantage
of x over z due to the shift from [xy]z to [xz]y. Recall that, in the
absence of an agenda effect d_i(x) = d_i(z) = D_i = 0, while under the proposed
model d_i(x) > 0 > d_i(z) and hence D_i has a positive expectation. The means
of the D_i distributions are .21 in Study I and .25 in Study II, which are
significantly positive, yielding t(99) = 4.2 and t(99) = 5.8, respectively,
p < .001 in both cases. In Study I, 60% of the D_i's are positive and 22%
negative; in Study II, 62% are positive and 18% negative. Hence, the
predicted pattern of choices is also confirmed in a within-subject comparison,
where choices are pooled over trials rather than over subjects.
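
The within-subject analysis can be sketched as follows. The data layout (one
dictionary per subject, listing the alternative chosen in each triple under
each constraint) and the function names are assumptions made for illustration;
the example subject is hypothetical and does not reproduce any record from the
studies.

```python
from math import sqrt
from statistics import mean, stdev


def advantage_scores(subjects):
    """D_i = d_i(x) - d_i(z) for each subject.

    Each subject maps a constraint label ("xz_y" for [xz]y, "xy_z" for [xy]z)
    to the list of alternatives ("x", "y" or "z") chosen under it.
    """
    scores = []
    for subject in subjects:
        def prop(alternative, constraint):
            picks = subject[constraint]
            return picks.count(alternative) / len(picks)
        d_x = prop("x", "xz_y") - prop("x", "xy_z")
        d_z = prop("z", "xz_y") - prop("z", "xy_z")
        scores.append(d_x - d_z)
    return scores


def one_sample_t(values):
    """t statistic for a zero mean, with len(values) - 1 degrees of freedom."""
    return mean(values) / (stdev(values) / sqrt(len(values)))


# One hypothetical subject with five triples under each constraint.
demo = [{"xz_y": ["x", "x", "y", "z", "y"], "xy_z": ["y", "y", "z", "z", "x"]}]
demo_scores = advantage_scores(demo)
```

With 100 subjects per study, `one_sample_t` applied to the D_i scores has 99
degrees of freedom, matching the t(99) statistics reported above.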

In summary, the data show that imposed constraints have a significant
impact  on  choice  behavior,  and  that  the  results  confirm  the  major  predictions 
of  the  proposed  model  of  constrained  choice.  The  present  results  about  individual 
choice,  that  are  based  on  the  correlational  pattern  among  the  alternatives,  should 
be  distinguished  from  the  results  of  Plott  and  Levine  (1978)  who  demonstrated  the 
effect  of  agenda  on  the  outcome  of  group  decision  based  on  majority  vote.  An 
agenda  often  introduces  strategic  considerations  that  could  affect  the  outcome 
of  a  voting  process,  even  if  it  does  not  change  the  ordering  of  the  options  for 
any  single  individual,  much  as  group  decision  can  be  intransitive  even  when 
its  members  are  all  transitive.  Although  different  effects  seem  to 
contribute to the failure of invariance in individual and in collective choice,
they  are  probably  both  present,  for  example,  in  many  forms  of  committee 
decision  making.  The  influence  of  procedural  constraints  on  either 
individual  or  social  choice  emerges  as  a  subject  of  great  theoretical  and 
practical  significance.  For  if  the  choice  of  a  new  staff  member,  for  example, 
depends  on  whether  the  initial  decision  concerns  the  nature  of  the  appointment 
(e.g., junior vs. senior), or the field (e.g., perception vs. social), then
the  order  in  which  decisions  are  made  becomes  an  important  component  of  the 
choice  process  that  cannot  be  treated  merely  as  a  procedural  matter. 

The present model of individual choice under constraints may serve three
related functions. First, it could be used to predict the manner in which
choices among political candidates, market products or public policies
are affected by the introduction or the change of agendas. Second, the model
may  be  used  to  construct  an  agenda  so  as  to  maximize  the  probability  of  a 
desired  outcome.  Experienced  politicians  and  seasoned  marketeers  are  undoubt¬ 
edly  aware  of  the  effects  of  grouping  and  separating  options.  A  formal 
model  may  nevertheless  prove  useful,  particularly  in  complex  decisions 
where  the  number  of  alternatives  is  large  and  computational  demands  exceed 
cognitive  limitations.  Third,  the  model  can  be  employed  by  a  group  or  a 
committee  as  a  framework  for  the  discussion  and  comparison  of  different  agendas. 
Although  an  ’optimal’  or  a  ’fair’  agenda  may  not  exist,  the  analysis  might 
help  clarify  the  issues  and  facilitate  the  choice.  If  all  members  of  the  group, 
for  example,  perceive  the  available  options  in  terms  of  the  same  tree  structure, 
even  though  they  have  different  weights  and  preferences,  then  the  use  of  an 
agenda  that  is  compatible  with  that  structure  is  recommendable  since  it  ensures 
invariance. The applications of the present development for the construction,
selection, and evaluation of agendas are still left to be developed.

Individual choice behavior is variable, complex and context dependent,
and the attempts to model it are, at best, incomplete. Even the most basic
axioms of preference are consistently violated under certain circumstances,
see, e.g., Kahneman and Tversky (1979), Lichtenstein and Slovic (1968),
Tversky (1969). The present treatment does not attempt to develop a comprehensive
theory of choice, but rather to analyze in detail a particular strategy
that appears to govern several decision processes. There are undoubtedly decision
processes  that  are  not  compatible  with  Pretree.  Some  of  them  could  perhaps 
be  explained  by  EBA,  while  others  may  require  different  theoretical  treatments. 
The  selection  of  a  choice  model,  however,  generally  involves  a  balance  between 
generality  or  scope  on  the  one  hand,  and  simplicity  or  predictive  power  on  the 
other.  Pretree  may  be  regarded  as  an  intermediate  model  that  is  much  less 
restrictive  than  CRM  since  it  is  compatible  with  the  similarity  hypothesis, 
yet  it  is  much  more  parsimonious  than  the  general  EBA  model  since  it  has  at 
most $2n - 2$ rather than $2^n - 2$ parameters.

Furthermore,  the  tree  model  may  provide  a  useful  approximation  to  a  more 
complex  structure,  in  the  same  way  that  a  two  dimensional  solution  often  provides 
a  useful  representation  of  a  higher  dimensional  structure.  Consider,  for 
example,  a  person  who  is  about  to  take  a  one-week  trip  to  a  single  European 
country and is offered a choice between France (F) and Italy (I) and between a
luxury tour (L) and an economy tour (E). Naturally, the luxury tour is much more
comfortable but also considerably more expensive than the economy tour. It
is easy to see that the four available alternatives $F_L$, $F_E$, $I_L$, $I_E$ do not satisfy
the inclusion rule because, for any triple, each alternative shares different
aspects with the other two. Hence, the EBA model cannot be reduced to a tree
in this case, although it can be approximated by a tree, provided one of the
attributes looms much larger than the others.

Suppose the decision maker is very concerned about the site of the
trip (Italy vs. France) but is not overly concerned about comfort or price.
In this case, the weights associated with the tour type (luxury vs. economy)
would be small in comparison with the weights associated with the sites.
Hence, the observed choice probabilities could be approximated fairly well
by the tree $(F_L F_E)(I_L I_E)$. On the other hand, if the decision maker is
much more concerned about the type of tour than about its site, his choice
probabilities will be better described by the tree $(F_L I_L)(F_E I_E)$. The
quality of either approximation depends on the degree to which one attribute
dominates the other, and it could be assessed directly by examining the
trinary and the quarternary conditions. An extension of the tree model
that deals with factorial structures will be described elsewhere.
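The quality of such an approximation can be explored numerically. The following is a minimal sketch, not the authors' procedure: it computes binary choice probabilities for the crossed (site by tour type) structure under EBA, and compares them with a tree $(F_L F_E)(I_L I_E)$ whose weights are simply assumed rather than fitted (each trip keeps its site aspect and receives the tour-type weight as a unique aspect). The function names and all numerical weights are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def eba(x, A, aspects, u):
    """P(x, A) under elimination by aspects (recursive definition)."""
    A = frozenset(A)
    if len(A) == 1:
        return Fraction(1)
    shared = frozenset.intersection(*(aspects[y] for y in A))
    pool = frozenset.union(*(aspects[y] for y in A)) - shared
    total = sum(u[a] for a in pool)
    if total == 0:                      # nothing discriminates: choose at random
        return Fraction(1, len(A))
    p = Fraction(0)
    for a in aspects[x] - shared:       # aspects shared by all of A are ignored
        survivors = [y for y in A if a in aspects[y]]
        p += u[a] * eba(x, survivors, aspects, u)
    return p / total

def factorial(site_w, tour_w):
    """Crossed structure: each trip has one site aspect and one tour aspect."""
    aspects = {s + t: frozenset({s, t}) for s in "FI" for t in "LE"}
    return aspects, {**site_w, **tour_w}

def site_tree(site_w, tour_w):
    """Assumed tree (F_L F_E)(I_L I_E): tour type becomes a unique aspect."""
    aspects = {s + t: frozenset({s, s + t}) for s in "FI" for t in "LE"}
    u = dict(site_w)
    u.update({s + t: tour_w[t] for s in "FI" for t in "LE"})
    return aspects, u

def max_binary_gap(site_w, tour_w):
    """Largest discrepancy in binary choice probability between the two models."""
    fa, fu = factorial(site_w, tour_w)
    ta, tu = site_tree(site_w, tour_w)
    return max(abs(eba(x, (x, y), fa, fu) - eba(x, (x, y), ta, tu))
               for x, y in combinations(fa, 2))

site_dominant = max_binary_gap({"F": 10, "I": 8}, {"L": 1, "E": 1})
tour_dominant = max_binary_gap({"F": 2, "I": 1}, {"L": 10, "E": 8})
```

With these illustrative weights the site-based tree tracks the crossed structure closely when the site weights dominate, and poorly when the tour-type weights dominate, in line with the argument above.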

Hierarchical or tree-like models of choice have been recently
employed by students of economics and market research who investigate
questions such as the share of the market to be captured by a new product,
or the probability that a consumer will switch from one brand to another.
Luce's model provides the simplest answers to such questions, but as we
have already seen, it is too restrictive. Perhaps the simplest way of
extending CRM is to assume that the offered set of alternatives can be
partitioned into classes so that the model holds within each homogeneous
class, even though it does not hold for heterogeneous sets.

This assumption underlies the analysis of brand switching developed
by the Hendry Corporation, and described by Kalwani and Morrison (1977).
According to the Hendry model, the probability that a consumer will
purchase a new brand, given that he switched from his old one, is proportional
to the market share of the new brand, provided the two brands
belong to the same class of the partition. The application of this model,
therefore, requires prior identification of an appropriate partition, or
tree structure, that is presumably constructed on the basis of informed
intuition. The similarity-based scaling procedure employed in this paper,
and the test of the necessary trinary and quarternary conditions could
perhaps be used to construct and validate the partition to which the
analysis of brand switching is applied.

The partition of the alternatives into homogeneous classes satisfying
CRM was also used by McFadden (1976, 1978) in his theoretical
and empirical analyses of probabilistic choice. As an economist, McFadden
was primarily interested in aggregate demand for alternatives (e.g., different
modes of transportation) as a function of measured attributes of the
alternatives and the decision makers (e.g., cost, travel time, income).
The Thurstonian, or random utility, model provides a natural framework
for such an analysis which assumes, in accord with classical economic
theory, that each individual maximizes his utility function defined over
the relevant set of alternatives and that the random component reflects the
sampling of individuals with different utility functions.

McFadden (1978) began with the multinomial logit (MNL) model in which

$$P(x,A) = \frac{\exp \sum_i \theta_i x_i}{\sum_{y \in A} \exp \sum_i \theta_i y_i}$$

where $x_1, \ldots, x_n$ are specified attributes of x, and $\theta_1, \ldots, \theta_n$ are parameters
estimated from the data. This is clearly a special case of Luce's model
where $\log u(x)$ is a linear function in the parameters $\theta_1, \ldots, \theta_n$. It is
expressible as a random utility model by assuming an extreme value distribution
$F(t) = \exp[-\exp -(at+b)]$, $a > 0$, see e.g., Luce (1977), Yellott (1977).
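The sense in which MNL is a special case of Luce's model can be illustrated with a short sketch. The attribute values, the coefficients and the function name `mnl` below are hypothetical; the point is only that the ratio of two choice probabilities does not depend on the offered set, which is the context-independence property discussed in the text.

```python
import math

def mnl(x, A, attrs, theta):
    """P(x, A) = exp(theta . x) / sum over y in A of exp(theta . y)."""
    def scale(y):
        # Luce scale value u(y), with log u(y) linear in the parameters
        return math.exp(sum(t * v for t, v in zip(theta, attrs[y])))
    return scale(x) / sum(scale(y) for y in A)

# Hypothetical attributes (cost, travel time) and coefficients.
attrs = {"car": (4.0, 30.0), "bus": (2.0, 50.0), "train": (3.0, 40.0)}
theta = (-0.5, -0.04)

full = ["car", "bus", "train"]
pair = ["car", "bus"]
ratio_full = mnl("car", full, attrs, theta) / mnl("bus", full, attrs, theta)
ratio_pair = mnl("car", pair, attrs, theta) / mnl("bus", pair, attrs, theta)
total = sum(mnl(y, full, attrs, theta) for y in full)
```

Here `ratio_full` and `ratio_pair` agree up to floating-point error, so adding the train cannot change the relative standing of car and bus, however similar the train is to either.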

The MNL model has been applied to several economic problems, notably
transportation planning (McFadden, 1976), but the failure of context-independence
led McFadden (1978) to develop a more general family of choice
models, called generalized extreme value models, that are compatible with the
similarity  hypothesis.  One  model  from  this  family,  called  the  nested  logit 
model,  assumes  a  tree  structure  in  which  the  probabilities  of  choice  at  each 
level  of  the  tree  conform  to  the  multinomial  logit  model,  see  McFadden  (1978). 
Although  the  nested  logit  model  does  not  coincide  with  Pretree,  the  two  models 
are  sufficiently  close  that  the  former  may  be  regarded  as  a  random  utility 
counterpart  of  the  latter. 

Psychological  models  of  individual  choice  fall  into  three  overlapping 
classes:  decomposition  models,  probabilistic  models  and  process  models. 
Decomposition  models  express  the  overall  value  of  each  alternative  as  a  function 
of  the  scale  values  associated  with  its  components.  This  class  includes  all 
the  variations  of  expected  utility  theory  as  well  as  the  various  adding  and 
averaging  models.  Probabilistic  models  relate  choice  data  to  an  underlying 
value  structure  through  a  probabilistic  process.  The  models  of  Thurstone  and 
Luce  are  prominent  examples.  Process  models  attempt  to  capture  the  mental 
operations  that  are  performed  in  the  course  of  a  decision.  This  approach, 
pioneered  by  Simon,  has  led  to  the  development  of  computer  models  designed 
to  simulate  the  decision  making  process.  Pretree,  like  the  more  general 
EBA,  belongs  to  all  three  classes.  It  is  a  decomposition  model  that 
expresses  the  overall  value  of  an  alternative  as  an  additive  combination  of 
the  values  of  its  aspects.  Unlike  most  decomposition  models,  however,  the 
relation between the observed choice and the underlying value structure is
probabilistic, and the formal theory is interpretable as a process model of
choice  behavior  that  is  based  on  successive  eliminations  following  a  tree 
structure. 

This paper exhibits three correspondence relations: (i) the equivalence of
elimination-by-tree and the hierarchical elimination model, (ii) the compatibility
of aggregate choice and the individual EBA model, and (iii) the
correspondence between preference and similarity trees. The three results,
however,  have  different  theoretical  and  empirical  status.  The  equivalence  of 
EBT  and  HEM  is  a  mathematical  fact  that  permits  the  application  of  the  tree 
model  to  both  random  and  hierarchical  decision  processes.  The  second  result 
offers a new interpretation of EBA as an aggregate choice model, thereby
providing a rationale for applying EBA to aggregate data. Finally, the compatibility
of similarity and preference trees is an empirical observation which
suggests that the two processes are related through a common underlying structure.

MATHEMATICAL APPENDIX
I. Proof of the Structure Theorem

To show that a tree representation of $T^* = \{x' \mid x \in T\}$ implies the inclusion
rule, let t(x) denote the path from the root of the tree to the terminal
node associated with x. For any x, y, z in T there are 4 possible tree
structures, and they all satisfy the inclusion rule as shown below.

a. If t(x) and t(y) meet below t(z), then $x' \cap y' \supset x' \cap z'$.

b. If t(x) and t(z) meet below t(y), then $x' \cap z' \supset x' \cap y'$.

c. If t(y) and t(z) meet below t(x), then $x' \cap y' = x' \cap z'$.

d. If t(x), t(y) and t(z) all meet at the same node then $x' \cap y' = x' \cap z'$.

In order to establish the sufficiency of the inclusion rule, let
$T_\alpha = \{x \in T \mid \alpha \in x'\}$, and let S(T) be the set of all $T_\alpha$ for any $\alpha$ in T'. To
prove that $T^* = \{x' \mid x \in T\}$ is a tree, it suffices to show that S(T) is a
hierarchical clustering. That is, for any $\alpha, \beta$ in T' either $T_\alpha \supset T_\beta$, or
$T_\beta \supset T_\alpha$, or $T_\alpha \cap T_\beta$ is empty. Suppose S(T) is not a hierarchical clustering.
Then there exist some distinct aspects $\alpha, \beta$ in T' and some x, y, z in
T such that $x \in T_\alpha \cap T_\beta$, $y \in T_\alpha - T_\beta$ and $z \in T_\beta - T_\alpha$. Hence, $\alpha$ is included in $x' \cap y'$,
$\beta$ is included in $x' \cap z'$, but $\alpha$ is not included in z' and $\beta$ is not included in
y'. Consequently, $x' \cap y'$ neither includes nor is included in $x' \cap z'$ and
the inclusion rule is violated, as required.
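The two conditions related by this proof can be checked mechanically on small examples. The sketch below is illustrative only: alternatives are represented as hypothetical sets of aspect labels, and the two functions test the inclusion rule and the hierarchical clustering property of the sets $T_\alpha$ directly from their definitions.

```python
from itertools import permutations

def satisfies_inclusion_rule(aspects):
    """For all x, y, z: x' & y' includes x' & z', or the other way round."""
    for x, y, z in permutations(aspects, 3):
        xy = aspects[x] & aspects[y]
        xz = aspects[x] & aspects[z]
        if not (xy >= xz or xz >= xy):
            return False
    return True

def is_hierarchical_clustering(aspects):
    """For all aspects a, b: T_a includes T_b, T_b includes T_a, or they are disjoint."""
    all_aspects = set().union(*aspects.values())
    clusters = [frozenset(x for x in aspects if a in aspects[x])
                for a in all_aspects]
    return all(s >= t or t >= s or not (s & t)
               for s in clusters for t in clusters)

# A tree (xy)z: theta is shared by x and y, the other aspects are unique.
tree = {"x": {"alpha", "theta"}, "y": {"beta", "theta"}, "z": {"gamma"}}

# The crossed France/Italy by luxury/economy example from the discussion.
crossed = {"FL": {"F", "L"}, "FE": {"F", "E"}, "IL": {"I", "L"}, "IE": {"I", "E"}}

results = {name: (satisfies_inclusion_rule(a), is_hierarchical_clustering(a))
           for name, a in (("tree", tree), ("crossed", crossed))}
```

On these two examples the tests agree, as the theorem requires: both hold for the tree and both fail for the crossed structure.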


II. Proof of the Equivalence Theorem.

(i) EBT implies HEM.

If EBT holds for T, then it must also hold for any $A \subset T$ with the induced
tree structure. Hence, it suffices to demonstrate the first two parts of Equation (3):

(a) If $\gamma|\beta$ and $\beta|\alpha$ then $P(A_\alpha, A_\gamma) = P(A_\alpha, A_\beta) P(A_\beta, A_\gamma)$.

(b) If $\gamma|\beta$ and $\gamma|\alpha$ then $\dfrac{P(A_\alpha, A_\gamma)}{P(A_\beta, A_\gamma)} = \dfrac{m(\alpha)}{m(\beta)}$, provided $m(\beta) \neq 0$.

We begin with the following auxiliary result. If $\beta|\alpha$, then

$$P(x, A_\beta) = \frac{m(\alpha)}{m(\beta) - u(\beta)} P(x, A_\alpha).$$

Let $\alpha_1, \ldots, \alpha_n$ be a sequence of links leading from x to $\alpha$. That is,
$\alpha_{i+1}|\alpha_i$, $i = 1, \ldots, n-1$, and $\alpha_n = \alpha$. Assuming EBT and $\beta|\alpha$,

$$\begin{aligned}
P(x, A_\beta) &= \frac{u(\alpha_1)}{m(\beta)-u(\beta)} P(x, A_{\alpha_1}) + \cdots + \frac{u(\alpha_n)}{m(\beta)-u(\beta)} P(x, A_{\alpha_n}) \\
&= \frac{m(\alpha_n)-u(\alpha_n)}{m(\beta)-u(\beta)} \sum_{i=1}^{n-1} \frac{u(\alpha_i)}{m(\alpha_n)-u(\alpha_n)} P(x, A_{\alpha_i}) + \frac{u(\alpha_n)}{m(\beta)-u(\beta)} P(x, A_{\alpha_n}) \\
&= \frac{m(\alpha_n)-u(\alpha_n)}{m(\beta)-u(\beta)} P(x, A_{\alpha_n}) + \frac{u(\alpha_n)}{m(\beta)-u(\beta)} P(x, A_{\alpha_n}) \\
&= \frac{m(\alpha)}{m(\beta)-u(\beta)} P(x, A_\alpha)
\end{aligned}$$

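as required. To prove (b) we assume that $\gamma|\beta$ and $\gamma|\alpha$, hence by the auxiliary result

$$P(A_\alpha, A_\gamma) = \sum_{x \in A_\alpha} P(x, A_\gamma) = \frac{m(\alpha)}{m(\gamma)-u(\gamma)} \sum_{x \in A_\alpha} P(x, A_\alpha) = \frac{m(\alpha)}{m(\gamma)-u(\gamma)},$$

and likewise $P(A_\beta, A_\gamma) = m(\beta)/[m(\gamma)-u(\gamma)]$, which yields the ratio in (b).

Using the inductive hypothesis, we obtain

$$\begin{aligned}
P(x, A_{\alpha_n}) &= \frac{m(\alpha_{n-1})}{m(\alpha_n)-u(\alpha_n)} P(x, A_{\alpha_{n-1}}) \\
&= \frac{u(\alpha_{n-1})}{m(\alpha_n)-u(\alpha_n)} P(x, A_{\alpha_{n-1}}) + \frac{m(\alpha_{n-1})-u(\alpha_{n-1})}{m(\alpha_n)-u(\alpha_n)} P(x, A_{\alpha_{n-1}}) \\
&= \frac{u(\alpha_{n-1})}{m(\alpha_n)-u(\alpha_n)} P(x, A_{\alpha_{n-1}}) + \frac{m(\alpha_{n-1})-u(\alpha_{n-1})}{m(\alpha_n)-u(\alpha_n)} \cdot \frac{\sum_{i=1}^{n-2} u(\alpha_i) P(x, A_{\alpha_i})}{m(\alpha_{n-1})-u(\alpha_{n-1})} \\
&= \frac{\sum_{i=1}^{n-1} u(\alpha_i) P(x, A_{\alpha_i})}{m(\alpha_n)-u(\alpha_n)},
\end{aligned}$$

which is the recursive expression for $P(x, A_{\alpha_n})$.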
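The equivalence can be checked numerically on a small tree. The sketch below is an illustration under stated assumptions, not part of the proof: the tree, its link weights, and the names `TREE`, `m`, `leaves`, `hem`, `aspect_sets` and `ebt` are all hypothetical. `hem` selects a branch at each node with probability proportional to its total measure m, while `ebt` applies the recursive elimination-by-aspects formula to the aspect sets induced by the same tree.

```python
from fractions import Fraction

# A link is (u, child): child is an alternative's name for a terminal link,
# or a list of the links that follow directly from it.
TREE = [(2, [(2, [(2, "x"), (1, "y")]), (1, "v")]), (3, "w")]

def m(link):
    """Measure of a link plus that of every link that follows from it."""
    u, child = link
    return u if isinstance(child, str) else u + sum(m(c) for c in child)

def leaves(link):
    _, child = link
    if isinstance(child, str):
        return {child}
    return set().union(*(leaves(c) for c in child))

def hem(x, branches):
    """Hierarchical elimination: choose a branch with probability proportional to m."""
    chosen = next(b for b in branches if x in leaves(b))
    p = Fraction(m(chosen), sum(m(b) for b in branches))
    _, child = chosen
    return p if isinstance(child, str) else p * hem(x, child)

def aspect_sets(branches, path=()):
    """Map each alternative to the links on its path; also return link weights."""
    aspects, u = {}, {}
    for i, (w, child) in enumerate(branches):
        link_id = path + (i,)
        u[link_id] = w
        if isinstance(child, str):
            aspects[child] = {link_id}
        else:
            sub_aspects, sub_u = aspect_sets(child, link_id)
            u.update(sub_u)
            for name, links in sub_aspects.items():
                aspects[name] = links | {link_id}
    return aspects, u

def ebt(x, A, aspects, u):
    """Elimination by aspects on the tree's aspect sets (weights assumed positive)."""
    A = set(A)
    if len(A) == 1:
        return Fraction(1)
    shared = set.intersection(*(aspects[y] for y in A))
    pool = set.union(*(aspects[y] for y in A)) - shared
    p = Fraction(0)
    for a in aspects[x] - shared:
        p += u[a] * ebt(x, [y for y in A if a in aspects[y]], aspects, u)
    return p / sum(u[a] for a in pool)

ASPECTS, U = aspect_sets(TREE)
ALTS = sorted(ASPECTS)
agree = all(hem(x, TREE) == ebt(x, ALTS, ASPECTS, U) for x in ALTS)
```

Exact rational arithmetic is used so that the two computations can be compared for equality rather than approximate agreement.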
III. Proof of the Representation Theorem.

The proof is divided into a series of lemmas. Let $P_T$ denote the set of
binary choice probabilities defined for all pairs of elements in T.

Lemma 1: If $T = \{x,y,z\}$, then $P_T$ satisfies Pretree with (xy)z iff the trinary
inequality (4) is satisfied in this form.

Proof: Necessity is obvious. To prove sufficiency, we use the notation of
Figure 5, where $R(x,y) \geq 1$. Set $\alpha = 1$, $\beta = R(y,x)$, and select $\theta \geq 0$ so that
$[R(x,z) - R(y,z)]\theta = R(y,z) - R(y,x)R(x,z)$, and let $\gamma = R(z,x)(1+\theta)$.

(Note that when $R(x,y) > 1$, $\theta$ is uniquely defined and positive, and when
$R(x,y) = 1$, $\theta$ can be chosen arbitrarily).

Let $\hat{P}_T$ be the set of binary probabilities obtained by using the above
expressions for $\alpha, \beta, \gamma, \theta$ in the defining equations of the model. It can be
verified, after some algebra, that $\hat{P}_T = P_T$ as required.
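The construction used in the sufficiency argument can be carried out explicitly. The following sketch assumes the tree (xy)z with $\alpha, \beta, \gamma$ unique to x, y, z and $\theta$ shared by x and y; the function names are hypothetical, and instead of restating inequality (4) the code simply rejects data for which no nonnegative $\theta$ exists.

```python
from fractions import Fraction

def fit_triple(p_xy, p_xz, p_yz):
    """Link lengths (alpha, beta, gamma, theta) for the tree (xy)z, with alpha = 1."""
    R = lambda p: p / (1 - p)           # R(a,b) = P(a,b) / P(b,a)
    r_xy, r_xz, r_yz = R(p_xy), R(p_xz), R(p_yz)
    if r_xy < 1:
        raise ValueError("label the pair so that R(x,y) >= 1")
    alpha = Fraction(1)
    beta = 1 / r_xy                     # beta = R(y,x)
    if r_xz == r_yz:
        if r_xy != 1:
            raise ValueError("data incompatible with the tree (xy)z")
        theta = Fraction(0)             # R(x,y) = 1: theta is arbitrary, 0 is one choice
    else:
        # [R(x,z) - R(y,z)] theta = R(y,z) - R(y,x) R(x,z)
        theta = (r_yz - r_xz / r_xy) / (r_xz - r_yz)
    if theta < 0:
        raise ValueError("no nonnegative theta: data incompatible with (xy)z")
    gamma = (1 + theta) / r_xz          # gamma = R(z,x) (1 + theta)
    return alpha, beta, gamma, theta

def binary_probs(alpha, beta, gamma, theta):
    """P(x,y), P(x,z), P(y,z) implied by the tree (xy)z."""
    return (alpha / (alpha + beta),
            (alpha + theta) / (alpha + theta + gamma),
            (beta + theta) / (beta + theta + gamma))

data = (Fraction(2, 3), Fraction(4, 5), Fraction(3, 4))   # P(x,y), P(x,z), P(y,z)
lengths = fit_triple(*data)
recovered = binary_probs(*lengths)
```

For the illustrative probabilities above, the fitted lengths reproduce the data exactly, and they are determined only up to the normalization $\alpha = 1$.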

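Before we go further, note that if $P_T$ satisfies Pretree with (xy)z and
$R(x,y) > 1$ then $\beta/\alpha = R(y,x)$. Furthermore,

$$\frac{\theta+\alpha}{\theta+\beta} = \frac{R(x,z)}{R(y,z)} \quad \text{implies} \quad \frac{\theta}{\alpha} = \frac{R(y,z) - R(y,x)R(x,z)}{R(x,z) - R(y,z)}$$

and

$$\frac{\alpha+\theta}{\gamma} = R(x,z) \quad \text{implies} \quad \frac{\gamma}{\alpha} = R(z,x)\left(1 + \frac{R(y,z) - R(y,x)R(x,z)}{R(x,z) - R(y,z)}\right) = \frac{1 - R(y,x)}{R(x,z) - R(y,z)}.$$
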

Hence, the lengths of all the links are determined up to multiplication
by a positive constant. Furthermore, the present model readily entails the
following property.

Lemma 2: Suppose A and $B = \{x,y,v\}$ are sets of objects such that $y, v \in A$
and $x \notin A$, and suppose that both $P_A$ and $P_B$ satisfy Pretree. (It is assumed that
P(v,y) is the same in both structures). Then the measures on A' and B' can
be selected so that $u(v'-y')$, as well as $u(y'-v')$, are the same in both
measures.

Lemma 3: Suppose $A = \{x,y,v\}$ and $B = \{y,v,w\}$ satisfy Pretree, with representing
measures $u_A$ and $u_B$, in the forms (xy)v and (yv)w, respectively. If $C = A \cup B = \{x,y,v,w\}$
satisfies the appropriate quarternary condition with (xy)(vw) or with ((xy)v)w,
then there exists a representing measure u on C' which extends both $u_A$ and $u_B$.
Naturally, we assume that $u_A$ and $u_B$ were selected according to Lemma 2.

Proof: Consider the form (xy)(vw), see Figure 5a. By Lemma 2, $u_A(\beta+\theta) = u_B(\beta+\theta)$
and $u_A(\lambda+\gamma) = u_B(\lambda+\gamma)$. Hence, $u_A$ and $u_B$ can be used to define a measure u on C'.

To show that u is a representing measure on C' we have to show that $R(x,w) = u(\theta+\alpha)/u(\lambda+\delta)$.
Since C satisfies Pretree, it follows from (5) that

$$R(x,w) = R(y,w)R(x,v)R(v,y) = \frac{u(\theta+\beta)}{u(\lambda+\delta)} \cdot \frac{u(\alpha+\theta)}{u(\lambda+\gamma)} \cdot \frac{u(\lambda+\gamma)}{u(\beta+\theta)} = \frac{u(\theta+\alpha)}{u(\lambda+\delta)}.$$

Next, consider the form ((xy)v)w, see Figure 5b. Here, we have to show that
$R(x,w) = u(\alpha+\theta+\lambda)/u(\delta)$. Applying (6) it follows that

$$R(x,w) = \cdots = \frac{u(\alpha+\theta+\lambda)}{u(\delta)}.$$

Lemma 4: $P_T$ satisfies Pretree with a specified structure iff for every $S \subset T$, with
four elements or less, $P_S$ satisfies Pretree relative to the same structure.

Proof: Necessity is immediate. Sufficiency is proved by induction on the cardinality
of T, denoted n. Suppose n > 4, and assume that the lemma holds for any cardinality
smaller than n.

Suppose (xy)v holds for any v in T. Let $A = T - \{x\}$, and $B = \{x,y,v\}$. By
the induction hypothesis, both $P_A$ and $P_B$ satisfy Pretree with the appropriate
structure. By Lemma 2 we can assume, with no loss of generality, that the measures
of y and v in A' coincide with their measures in B'. Since any aspect in T'
appears either in A' or in B', and since the aspects that appear in both trees
have the same measure, we can define the measure of any aspect in T' by its measure
in A' or in B'. Letting $\hat{P}$ denote the calculated binary probability function, we
show that $\hat{P}_T = P_T$.

Since $\hat{P}_A = P_A$ and $\hat{P}_B = P_B$, it remains to be shown that $\hat{P}(x,w) = P(x,w)$ for
any $w \in T - B$.

Let $C = \{x,y,v,w\}$, which satisfies Pretree, by assumption, with either (xy)(vw)
or ((xy)v)w. Since $C = B \cup \{y,v,w\}$, Lemma 3 implies that the representing measure
on C' coincides with the restriction to C' of the defined measure on T'. Hence,
$\hat{P}(x,w) = P(x,w)$ as required.

In conclusion, Lemma 3 together with Lemma 1 show that the trinary and the
quarternary conditions are necessary and sufficient for the representation of
quadruples. Lemma 4 shows that if Pretree is satisfied by all quadruples, then
it is satisfied by the entire object set. This completes the proof of the
representation theorem.

IV.  Uniqueness  Considerations. 

It  follows  readily  from  the  representation  theorem  that,  given  a  tree 
structure,  the  measure  u  is  unique  up  to  multiplication  by  a  positive  constant 
except in the case where all binary choice probabilities equal 1/2. We show
that  the  tree  structure  is  uniquely  determined  by  the  binary  and  the  trinary 
choice  probabilities,  but  not  by  the  binary  data  alone. 

To show that binary choice probabilities do not always determine a unique
tree structure, consider two different trees (xy)z and (yz)x, and let
$\alpha, \beta, \gamma$ denote, respectively, the unique aspects of x, y, z, let $\theta$ denote the
aspects shared by x and y, and let $\lambda$ denote the aspects shared by y and z. Let
u and v be the measures associated with (xy)z and (yz)x, respectively, and
suppose that

$u(\alpha) = 2$, $u(\beta) = 1$, $u(\gamma) = 1$, and $u(\theta) = 2$
$v(\alpha) = 8$, $v(\beta) = 3$, $v(\gamma) = 1$, and $v(\lambda) = 1$

By the assumed tree structures $u(\lambda) = v(\theta) = 0$. It is easy to verify that the
two trees yield identical binary choice probabilities: P(x,y) = 2/3, P(y,z) = 3/4,
P(x,z) = 4/5. We next show that the tree structure is uniquely determined by the
binary and the trinary choice probabilities, provided all binary probabilities
are non-zero. Consider a tree (xy)z with a measure u, and aspects $\alpha, \beta, \gamma, \theta$
defined as above. Assume $u(\alpha)$, $u(\beta)$, $u(\gamma)$ and $u(\theta)$ are nonzero. It follows
from (xy)z that

$$\frac{P(x,y)}{P(y,x)} = \frac{u(\alpha)}{u(\beta)} = \frac{u(\alpha) + u(\theta)u(\alpha)/(u(\alpha)+u(\beta))}{u(\beta) + u(\theta)u(\beta)/(u(\alpha)+u(\beta))} = \frac{P(x,xyz)}{P(y,xyz)}.$$

Suppose the data were compatible with another tree structure, say (yz)x with
no loss of generality. By the same argument

$$\frac{P(y,z)}{P(z,y)} = \frac{P(y,xyz)}{P(z,xyz)}, \quad \text{and hence} \quad \frac{u(\beta)+u(\theta)}{u(\gamma)} = \frac{u(\beta) + u(\theta)u(\beta)/(u(\alpha)+u(\beta))}{u(\gamma)}$$

which implies $u(\alpha) = 0$ contrary to our assumption. Given both binary and
trinary probabilities, therefore, the structure of any triple and hence of the
entire tree is uniquely determined.
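The numerical example above can be verified directly. The sketch below uses the measures u and v given in the text; the function names and the closed-form expressions for three alternatives are written out here for illustration, with exact rational arithmetic so that equality can be tested.

```python
from fractions import Fraction as Fr

def tree_xy_z(a, b, c, t):
    """Tree (xy)z: a, b, c are unique to x, y, z; t is shared by x and y."""
    k = a + b + c + t
    binary = {"xy": Fr(a, a + b),
              "yz": Fr(b + t, b + t + c),
              "xz": Fr(a + t, a + t + c)}
    trinary = {"x": (a + t * Fr(a, a + b)) / k,
               "y": (b + t * Fr(b, a + b)) / k,
               "z": Fr(c, k)}
    return binary, trinary

def tree_yz_x(a, b, c, l):
    """Tree (yz)x: a, b, c are unique to x, y, z; l is shared by y and z."""
    k = a + b + c + l
    binary = {"xy": Fr(a, a + b + l),
              "yz": Fr(b, b + c),
              "xz": Fr(a, a + c + l)}
    trinary = {"x": Fr(a, k),
               "y": (b + l * Fr(b, b + c)) / k,
               "z": (c + l * Fr(c, b + c)) / k}
    return binary, trinary

binary_u, trinary_u = tree_xy_z(2, 1, 1, 2)   # the measure u from the text
binary_v, trinary_v = tree_yz_x(8, 3, 1, 1)   # the measure v from the text
```

The two trees agree on every binary probability but disagree on the trinary ones, which is why the trinary data are needed to identify the structure.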
V. Proof of the Compatibility Theorem.

It follows readily from HEM, see Equation (3), that

$$P(x,A) = P(x,A_1)P(A_1,A_2) \cdots P(A_{n-1},A_n)$$

for some sequence $A_1, \ldots, A_n$ such that $A_n = A$, and $A_i \subset A_{i+1}$, $i = 1, \ldots, n-1$. We
show first that the sequence can be chosen so that $a_i = i+1$, $1 < i < n$,
where $a_i$ is the cardinality of $A_i$. This condition is obviously satisfied
in a binary tree where each node joins at most two links. Suppose then
that the tree contains three links that meet at the same node, e.g.,
$\delta|\gamma$, $\delta|\beta$ and $\delta|\alpha$. Hence, by part (b) of Equation (3),

$$P(A_\alpha, A_\delta) = \frac{m(\alpha)}{m(\alpha)+m(\beta)+m(\gamma)} = \frac{m(\alpha)}{m(\alpha)+m(\beta)} \cdot \frac{m(\alpha)+m(\beta)}{m(\alpha)+m(\beta)+m(\gamma)} = P(A_\alpha, A_\alpha \cup A_\beta) P(A_\alpha \cup A_\beta, A_\delta),$$

and the result is readily extended to nodes with k links. Under Pretree,
therefore, P(x,A) is expressible as a product where each factor $P(A_i, A_{i+1})$
is a probability of choosing between two branches.

Under Equation (13), the probability of selecting x from A under
a specified agenda equals $P(x,B_1)P(B_1,B_2) \cdots P(B_m,A)$, for some $B_1 \subset B_2 \subset \cdots \subset B_m \subset A$.
By compatibility, there exists a tree and hence a binary tree that refines
both the agenda and the intrinsic tree structure. By the above argument,
P(x,A) is expressible as a product $P(x,A_1)P(A_1,A_2) \cdots P(A_{n-1},A_n)$ where $a_i = i+1$,
$1 < i < n$, corresponding to a binary tree that refines both structures.
Thus, each $B_j$, $j = 1, \ldots, m$, appears among the $A_i$, $i = 1, \ldots, n$. Suppose
$B_j = A_i$ and $B_{j+1} = A_{i+t}$, hence

$$P(B_j, B_{j+1}) = P(A_i, A_{i+t}) = \prod_{k=i}^{i+t-1} P(A_k, A_{k+1}), \quad \text{and}$$

$$P(x,A) = P(x,A_1)P(A_1,A_2) \cdots P(A_{n-1},A_n) = P(x,B_1)P(B_1,B_2) \cdots P(B_m,A).$$


Hence,  choice  probability  is  unaffected  by  an  agenda  that  is  compatible  with 
the  intrinsic  structure  of  a  preference  tree. 

If  the  agenda  is  not  compatible  with  the  intrinsic  tree,  there  exists 
some x,y,z in T such that both (xy)z and [xz]y hold. It is easy to verify
(see the discussion in the text) that $P(x,xyz) \neq P(x,[xz]y)$ in this case,
which establishes the necessity of the compatibility condition.
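Both directions of the theorem can be illustrated on three alternatives. The sketch below assumes a tree (xy)z with hypothetical weights, and it assumes that choice under an agenda is computed as in the product formula above: the probability of first retaining a subset (the sum of the unconstrained probabilities of its members) times the probability of choosing x within that subset. The names `free`, `pair` and `under_agenda` are introduced only for this example.

```python
from fractions import Fraction as Fr

# Hypothetical weights on the tree (xy)z: a, b, c unique to x, y, z; t shared by x, y.
a, b, c, t = 2, 1, 1, 2
k = a + b + c + t

# Unconstrained trinary probabilities P(., xyz) under the tree model.
free = {"x": (a + t * Fr(a, a + b)) / k,
        "y": (b + t * Fr(b, a + b)) / k,
        "z": Fr(c, k)}

# Binary probabilities P(x, y) and P(x, z) under the same tree.
pair = {("x", "y"): Fr(a, a + b),
        ("x", "z"): Fr(a + t, a + t + c)}

def under_agenda(x, other):
    """P(x, [x other] rest): first retain {x, other}, then choose x within it."""
    return (free[x] + free[other]) * pair[(x, other)]

compatible = under_agenda("x", "y")     # agenda [xy]z matches the tree (xy)z
incompatible = under_agenda("x", "z")   # agenda [xz]y cuts across the tree
```

With these weights the compatible agenda leaves the probability of choosing x unchanged, whereas the incompatible agenda shifts it.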
The present notion of a preference tree should be distinguished from
the concept of a decision tree, commonly used in the analysis of decisions
under uncertainty.

2

To obtain compact figures we use a heavy line (see Figure 7) to
indicate double length, and an extra heavy line (see Figure 11) to indicate
ten-fold length.

Figure 1. Schematic representation of three alternatives.

Figure  2.  Tree  representation  of  the  choice  among  entrees. 

Figure 3. An illustration of the inclusion rule $x' \cap y' \supset x' \cap z'$
(a) as a Venn diagram, (b) as a tree.

Figure 4. A preference tree for the choice among modes of transportation.

Figure  5.  A  preference  tree  for  three  alternatives. 

Figure  6.  Preference  trees  for  four  alternatives. 

Figure  7.  Preference  tree  for  choice  among  celebrities. 

Figure  8.  Additive  tree  (ADDTREE)  representation  of  the  similarities 
between  Swedish  political  parties. 

Figure  9.  Preference  tree  for  choice  among  Swedish  political  parties. 

Figure  10.  Preference  tree  for  choice  among  Italian  political  parties. 

Figure  11.  Preference  tree  for  choice  among  social  sciences. 

Figure 12. A schematic preference tree for the choice between shades

Figure 4. A preference tree for the choice among modes of transportation.

Figure 7. Preference tree for choice among celebrities.

Figure 9. Preference tree for choice among Swedish political parties.